Insert global variable declaration with a gcc plugin - c++

I would like to know if it's possible to insert a global variable declaration with a gcc plugin. For example if I have the following code:
test.c:
int main(void) {
return 0;
}
and I want to transform it into:
int fake_var;
int main(void) {
return 0;
}
Is that possible?
If it's possible, in which pass and how can I do it?

I think you'll want to take a look at varpool_add_new_variable() in varpool.c. You should be able to pass a declaration built with type VAR_DECL to it. Similarly, take a look at add_new_static_var(), but I think the former is what you want, as it was added specifically to allow declaring globals in the middle/back end.

Using GCC's -D option, you can pass a value to a C program.
For example:
#include <stdio.h>
int main(void)
{
printf("global decl %d\n", gvar);
return 0;
}
gcc -Dgvar=10 gcc.c
This may be the closest behaviour to what you are looking for, though it is not equivalent to a global variable declaration: it is a macro substitution at compile time.

Below is an example of creating a global integer variable:
// add_new_static_var is declared in cgraph.h
tree global_var = add_new_static_var(integer_type_node);
// If you want to name the variable:
tree name = get_identifier("my_global_name");
DECL_NAME(global_var) = name;
/* If you plan to refer to it from another, subsequent compilation, you
will also need to give it an assembler name like this: */
change_decl_assembler_name(global_var, name);
Keep in mind that any other compilation that is later linked with this one must declare the global variable too. Set DECL_EXTERNAL(global_var) = 1 in every compilation except the original one. Only in the original compilation (the one that holds the variable) should you instead set the property TREE_PUBLIC(global_var) = 1.

Ehm, no. Don't. Why would you want to? You can't use it.
The closest you can get is to define the variable in a new .c file and link it in separately, but it would still have to be declared (with extern) in test.c for test.c to use it.
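A minimal sketch of that declaration/definition split, shown in a single file only to keep it short. The value 42 and the helper name read_fake_var are assumptions made for illustration; in a real build the last line would live in its own .c file.

```c
/* test.c side: a declaration only. No storage is reserved here. */
extern int fake_var;

/* Hypothetical helper showing that test.c can use the variable
   once it has been declared. */
int read_fake_var(void)
{
    return fake_var;
}

/* Separate .c file side: the single definition that the linker
   resolves the declaration above against. */
int fake_var = 42;
```

Without the extern declaration, the use of fake_var inside read_fake_var would not compile, which is the point the answer makes.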

You can create a file that contains the code you want to add to the top of the input file and use the -include yourfile option.
This tells the preprocessor to behave as if #include "yourfile" appeared at the top of the input file.
See this question:
Include header files using command line option?
But you have to build that C file separately, since the included file will be added to all compilation units.

Related

Initialize vector of strings from text files at compile time

I'm working on a project that has a directory of files: myFiles.
I need something like:
const vector<string> configs = { contents_of_file0, contents_of_file1, ... };
It is desired that the contents of these files be part of the binary, as opposed to being read at runtime.
Is there a clean way to do this?
Today, there is a massive hack: the contents of all the files are concatenated into a single #define by a build script.
#define contents "a super massive string that is too large for some compilers to ingest"
This #define is later parsed at runtime.
I'm looking for a cleaner way...
> Is there a clean way to do this?
There is a proposed feature for this purpose that may end up in a future standard.
Until then, you can use meta programming: Generate source code from the input file. The generated source should contain the initialiser based on the file. An open source program exists that can do this: xxd
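As a rough sketch of what such generated source can look like: xxd -i emits an unsigned char array and a length variable named after the input file. The file names file0.txt and file1.txt and their contents ("hello" and "hi") are assumptions, and the arrays below are written by hand in that shape rather than produced by a real run. The struct and accessor names are hypothetical too.

```c
#include <stddef.h>

/* Shape of the output of `xxd -i file0.txt` for a file containing "hello". */
unsigned char file0_txt[] = {
  0x68, 0x65, 0x6c, 0x6c, 0x6f
};
unsigned int file0_txt_len = 5;

/* Same shape for a second, hypothetical file containing "hi". */
unsigned char file1_txt[] = {
  0x68, 0x69
};
unsigned int file1_txt_len = 2;

/* Hypothetical view type: the generated bytes are not NUL-terminated,
   so the length has to travel with the pointer. */
struct embedded_file {
    const unsigned char *data;
    size_t len;
};

/* One separate initialiser per input file, instead of one huge macro. */
static const struct embedded_file configs[] = {
    { file0_txt, sizeof file0_txt },
    { file1_txt, sizeof file1_txt },
};

size_t config_count(void)
{
    return sizeof configs / sizeof configs[0];
}

/* Returns NULL when the index is out of range. */
const struct embedded_file *config_at(size_t i)
{
    return i < config_count() ? &configs[i] : NULL;
}
```

In a C++ project, each entry could then be turned into a std::string from its data pointer and length when the vector is built.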
> concatenated into a single #define
I don't see an advantage to this. If you want separate strings, then generate separate initialisers. I also don't see a need for a macro.
I also recommend carefully re-considering whether it even makes sense to want this. Loading a massive executable isn't any faster than reading large files, and reading files is more flexible.
"This #define is later parsed at runtime." This statement is wrong: #define is a preprocessor directive, which is handled at a very early stage of compilation. Static strings can also be parsed at build time since C++17, so another solution to consider is generating a constexpr vector, combined with constexpr vector support from C++2a.
You may consider using the include hack for simplicity. If you use it, please don't forget to set up the dependency on the .tbl files in your makefile.
For the source file:
#include <iostream>
#include <string>
#include <vector>
int main(int argc, char* argv[]) {
std::vector<std::string> vec = {
#include "1.tbl"
#include "2.tbl"
};
for (auto& v : vec) {
std::cout << v << std::endl;
}
return 0;
}
For the table file 1.tbl:
"1","2","3",
For the table file 2.tbl:
"a", "b","c",
We get the output:
1
2
3
a
b
c

Insert string at linking time

I want to define an external global symbol (const char*) used by the program, and add the symbol at link time with a given value. This is useful for, e.g., a commit hash or build time.
I found --defsym, which does something else. The Go linker supports this functionality via the -X option. (Yeah, I know, Go strings are managed, but I am talking about plain old zero-terminated C strings.)
For example:
#include <stdio.h>
extern const char *git_commit;
int main(int argc, char *argv[]) {
puts(git_commit);
return 0;
}
gcc main.o -Wl<something here that adds git_commit and set it to '84e5...'>
I am aware of the config.h approach and of building object files containing those strings. But it's 2019 by now. Such a simple task should be easy.
Edit: more precise question
Is there an equivalent option in gcc/binutils for the Go linker's -X option?
There is a chance to do it at compile/preprocessing time; consider this:
#include <stdio.h>
const char * git_commit = GIT_COMMIT;
int main(int argc, char ** argv) {
puts(git_commit);
return 0;
}
and on the command line:
gcc -o test test.c -DGIT_COMMIT="\"This is a test\""
assuming you are using the GCC compiler.
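The escaped quotes on the command line can be avoided by letting the preprocessor add them. The following is a sketch under assumptions: the helper macro names STR and XSTR, the fallback token unknown, and the function print_commit are illustrative and not part of the answer above. It works for values that form a single preprocessing token, such as a hexadecimal commit hash.

```c
#include <stdio.h>

/* Two-level stringification: XSTR expands its argument first,
   then STR turns the result into a string literal. */
#define STR(x) #x
#define XSTR(x) STR(x)

/* Fallback for builds that do not pass -DGIT_COMMIT=...;
   the token `unknown` is an assumption of this sketch. */
#ifndef GIT_COMMIT
#define GIT_COMMIT unknown
#endif

const char *git_commit = XSTR(GIT_COMMIT);

/* Hypothetical helper that prints the embedded value. */
void print_commit(void)
{
    puts(git_commit);
}
```

With this in place, the build could pass the value without inner quotes, for example gcc -c -DGIT_COMMIT=$(git rev-parse HEAD) test.c.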
One way:
echo "char const* git_commit = \"$(git rev-parse HEAD)\";" > git_commit.c
gcc -c -o git_commit.o git_commit.c
gcc -o main main.o git_commit.o
When you implement this in a makefile, you may want to recreate git_commit.c only when the revision changes, so that it doesn't relink on each make invocation.
I'm not aware of any way to do what you're seeking to do directly via a flag using one of the commonly used linkers. In general, if you want to link the definition of such an object into your program, you'll have to provide an object file containing a suitable symbol definition.
Probably the simplest way to get there is to invoke the compiler on the definition of your variable, with the content fed in from a macro defined on the command line, as already suggested in the other answers. If you want to avoid creating temporary source files, gcc can also receive input straight from stdin:
echo "const char git_commit[] = GIT_COMMIT;" | gcc -DGIT_COMMIT=\"asdf\" -c -o git_commit.obj -xc -
And in your code, you just declare it as
extern const char git_commit[];
Note: I'm using const char git_commit[] rather than const char* git_commit. That way, git_commit will directly be an array of suitable size initialized to hold the contents of the commit hash. const char* git_commit, on the other hand, will create a global pointer object initialized to hold the address of a separate string literal object, which means you introduce an unnecessary indirection. Not that it will really matter here, but it also doesn't really cost you anything to skip the inefficiency, however tiny it might be…
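The difference between the two forms can be made visible with sizeof. This is only an illustrative sketch: the names commit_array and commit_pointer and the two helper functions are assumptions, and the value "asdf" is reused from the command line above.

```c
#include <stddef.h>

/* Array form: the object itself holds the characters,
   here 4 letters plus the terminating NUL. */
const char commit_array[] = "asdf";

/* Pointer form: a separate pointer object that holds the address
   of a string literal stored elsewhere. */
const char *commit_pointer = "asdf";

/* Size of the array object: the length of the text plus one. */
size_t array_object_size(void)
{
    return sizeof commit_array;
}

/* Size of the pointer object: independent of the text length. */
size_t pointer_object_size(void)
{
    return sizeof commit_pointer;
}
```

The array form also lets other code in the same translation unit obtain the length with sizeof, which the pointer form cannot do.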
There is also the objcopy utility, which can be used to wrap arbitrary binary content in an object file; see, e.g., "How do I embed the contents of a binary file in an executable on Mac OS X?". It may even be possible to pass input to objcopy straight via stdin as well. Finally, you could also just write your own tool that directly writes an object file containing a suitable symbol definition. Consider, however, that, at the end of the day, you're seeking to generate an object file that can be linked with the other object files making up your program. Simply using the same compiler you use to compile the rest of the code is probably the most robust way of going about doing that. With other solutions, you'll always have to manually make sure that the object files are compatible in terms of target architecture, memory layout, …
You could embed it as a resource file, for example:
GitInfoHTML HTML "GitInfo.html"
This will be placed into the executable by the linker (at link time) and can then be loaded.
For more info, see:
https://learn.microsoft.com/en-us/cpp/windows/resource-files-visual-studio?view=vs-2019

C++ __TIME__ is different when called from different files

I encountered this strange thing while playing around with predefined macros.
So basically, when calling __TIME__ from different files, this happens:
Is there any way I can fix this? Or why does this happen?
All I am doing is printf("%s\n", __TIME__); from different functions in different sources.
> Or why does this happen?
From the docs:
This macro expands to a string constant that describes the time at which the preprocessor is being run.
If source files are compiled at different times, then the time will be different.
> Is there any way I can fix this?
You could use a command line tool to generate the time string, and pass the string as a macro definition to the compiler. That way the time will be the same for all files compiled by that command.
To answer your original question: __TIME__ is going to be different for different files because it specifies the time when that specific file was compiled.
However, you're asking an X-Y problem. To address what you're actually trying to do:
If you need a compilation-time value, you're better off letting your build system specify it. That is, with make or whatever you're using, generate a random seed somehow, then pass that to the compiler as a command-line option to define your own preprocessor macro (e.g. gcc -DMY_SEED=$(random_value) ...). Then you could apply that to all C files that you compile and have each of them use MY_SEED however you want.
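On the C side, that approach might look like the sketch below. The fallback value 12345u and the function names build_seed and seed_rng are assumptions made for illustration; only the MY_SEED macro name comes from the text above.

```c
#include <stdlib.h>

/* MY_SEED is expected from the build system, e.g. -DMY_SEED=...
   The fallback value is an assumption so the file compiles on its own. */
#ifndef MY_SEED
#define MY_SEED 12345u
#endif

/* Every file compiled with the same -DMY_SEED sees the same value. */
unsigned int build_seed(void)
{
    return (unsigned int)(MY_SEED);
}

/* Hypothetical helper: seeds the C library generator with that value. */
void seed_rng(void)
{
    srand(build_seed());
}
```

Because the value is fixed once per build command rather than once per file, it no longer depends on when each source file happened to be preprocessed.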
Well, I think your use case is kind of weird, but a simple way to get the same time in all files is to use __TIME__ in exactly one source file, and use it to initialize a global variable:
compilation_time.h:
extern const char *compilation_time;
compilation_time.c:
#include "compilation_time.h"
const char *compilation_time = __TIME__;
more_code.c:
#include "compilation_time.h"
...
printf("%s\n", compilation_time);
If you really want to construct an integer as in your comment (which may be non-portable as it assumes ASCII), you could do
seed.h:
extern const int seed;
seed.c:
#include "seed.h"
const int seed = (__TIME__[0] - '0') + ...;
more_code.c:
#include "seed.h"
...
srand(seed);
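One possible way to turn the time string into an integer is sketched below. It is not the formula elided in the answer above, just one option: it computes the value at run time and interprets the "HH:MM:SS" format of __TIME__ as seconds since midnight. Both function names are hypothetical.

```c
/* Hypothetical helper: converts an "HH:MM:SS" string, the format
   used by __TIME__, into seconds since midnight. The caller must
   pass a string of at least 8 characters in that layout. */
unsigned int seed_from_time_string(const char *t)
{
    unsigned int hours   = (unsigned int)(t[0] - '0') * 10u + (unsigned int)(t[1] - '0');
    unsigned int minutes = (unsigned int)(t[3] - '0') * 10u + (unsigned int)(t[4] - '0');
    unsigned int seconds = (unsigned int)(t[6] - '0') * 10u + (unsigned int)(t[7] - '0');
    return hours * 3600u + minutes * 60u + seconds;
}

/* Uses __TIME__ in exactly one place, so every caller in the
   program gets the same value. */
unsigned int compile_time_seed(void)
{
    return seed_from_time_string(__TIME__);
}
```

Other source files would call compile_time_seed() instead of using __TIME__ themselves.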

Try to find global variables from compiled files. The program can't distinguish constants from global variables.

Good day! I've been trying to find a solution for a long time.
My problem is:
For example I have 2 .cpp files, one of them containing
const std::string DICTIONARY_DEFAULT = "blah";
const std::string ADDTODICTIONARY_DEFAULT = "blah";
const std::string BUTTONS = "blah";
and the second one with
static int x1;
static int NewY1, NewY2, NewX1, NewX2;
Both fragments are in the global variables section. I need to print the global static variables (for example), but ignore constants. In nm output they look absolutely identical (type b in every case, which means an uninitialized local-scope symbol). Is there any way to separate these cases automatically using only Linux utilities (grep, regexps and so on are perfectly okay)?
MY TASK FOR BETTER UNDERSTANDING:
There is a program in C++ whose main task is to find and output the list of global variables.
The input data is an archive with lots of .cpp files. Every .cpp file is a syntactically correct C++ program (it must compile successfully with both the GNU C++ and Microsoft Visual C++ compilers).
For every file in the archive, I must output on a separate line the name of the file and the list of global variables, as in this example:
Output Data :
000000.cpp ancestor ansv cost graph M N p qr query u
000001.cpp
000002.cpp
000003.cpp
000004.cpp
000005.cpp
000006.cpp
000007.cpp edge tree
Finding global variables is the subject of this clang tutorial. The author did it 'just for fun', but you may add some code to do exactly what you need (btw, it is not as hard as one might guess :)).
Short answer: There is actually no way to do it in every case
Long answer: Take a look at the SYMBOL TABLE using 'objdump -x file.o'. You can see that all global variables, both static and const, are allocated into a section called .bss. A section called .rodata also exists and it is, generally speaking, used to store const data. Unfortunately, in your case you are declaring two const std::string objects. Those objects are initialized by invoking their constructor before the 'main' function is run. Still, the initialization of their fields happens at run-time and so they are only 'logically' const, and not really const.
The compiler has no choice but to allocate them into the .bss section with all other globals.
If you add the following line
const int willBeInRoData = 42;
You will find that its symbol will be in the .rodata section and so it will be distinguishable from the other global integers.

get the value of a c constant

I have a .h file in which hundreds of constants are defined as macros:
#define C_CONST_NAME Value
What I need is a function that can dynamically get the value of one of these constants.
needed function header :
int getConstValue(char * constName);
Is that even possible in the C language?
---- EDIT
Thanks for the help, that was quick :)
As I was thinking, there is no miracle solution for my needs.
In fact, the header file I use is generated by SCADE (http://www.esterel-technologies.com/products/scade-suite/).
One of the solutions I got, from @Chris, is to use some Python to generate C code that does the work.
Now it's up to me to make some optimizations in order to find the constant name. I have more than 5000 constants, O(500^2).
I'm also looking at "X-Macros". It's the first time I've heard of them; I hope it works in C, because I'm not allowed to use C++.
Thanks
C can't do this for you. You will need to store them in a different structure, or use a preprocessor to build the hundreds of if statements you would need. Something like Cogflect could help.
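As a sketch of the "different structure" option: a table sorted by name can be searched with the standard bsearch function, so each lookup takes O(log n) string comparisons. The constant names and values here are illustrative, the table would normally be generated, and returning -1 for an unknown name is an assumption that only works if -1 is never a real value.

```c
#include <stdlib.h>
#include <string.h>

struct constant {
    const char *name;
    int value;
};

/* Illustrative entries. The table must stay sorted by name in
   strcmp order, otherwise bsearch gives wrong results. */
static const struct constant constants[] = {
    { "C_TEN", 10 },
    { "C_THIRTY", 30 },
    { "C_TWENTY", 20 },
};

/* bsearch passes the search key first and a table element second. */
static int compare_name(const void *key, const void *element)
{
    const struct constant *c = element;
    return strcmp((const char *)key, c->name);
}

/* Returns the value for constName, or -1 when the name is unknown. */
int getConstValue(const char *constName)
{
    const struct constant *hit = bsearch(constName, constants,
                                         sizeof constants / sizeof constants[0],
                                         sizeof constants[0], compare_name);
    return hit ? hit->value : -1;
}
```

A generator script would have to emit the entries in sorted order, or the program could sort the table once at startup with qsort and a matching comparison function.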
Here you go. You will need to add a line for each new constant, but it should give you an idea about how macros work:
#include <stdio.h>
#include <string.h>
#define C_TEN 10
#define C_TWENTY 20
#define C_THIRTY 30
#define IFCONST(charstar, define) if(strcmp((charstar), #define) == 0) { \
return (define); \
}
int getConstValue(const char* constName)
{
IFCONST(constName, C_TEN);
IFCONST(constName, C_TWENTY);
IFCONST(constName, C_THIRTY);
// No match
return -1;
}
int main(int argc, char **argv)
{
printf("C_TEN is %d\n", getConstValue("C_TEN"));
return 0;
}
I suggest you run gcc -E filename.c to see what gcc does with this code.
A C preprocessor macro (that is, something named by a #define statement) ceases to exist after preprocessing completes. A program has no knowledge of the names of those macros, nor any way to refer back to them.
If you tell us what task you're trying to perform, we may be able to suggest an alternate approach.
This is what X-Macros are used for:
https://secure.wikimedia.org/wikipedia/en/wiki/C_preprocessor#X-Macros
But if you need to map a string to a constant, you will have to search for the string in the array of string representations, which is O(n) per lookup with a linear search.
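A minimal X-macro sketch in plain C follows. The list name CONSTANT_LIST, the three entries, and the -1 "not found" return value are assumptions for illustration. The single list is expanded twice: once into enum constants and once into a name/value table that a linear search walks.

```c
#include <stddef.h>
#include <string.h>

/* The single source of truth: one X(name, value) entry per constant. */
#define CONSTANT_LIST \
    X(C_TEN, 10) \
    X(C_TWENTY, 20) \
    X(C_THIRTY, 30)

/* First expansion: real compile-time constants. */
#define X(name, value) name = value,
enum { CONSTANT_LIST };
#undef X

struct constant {
    const char *name;
    int value;
};

/* Second expansion: #name stringifies the identifier, giving a
   table that still knows the names at run time. */
#define X(name, value) { #name, value },
static const struct constant constant_table[] = { CONSTANT_LIST };
#undef X

/* Linear search over the table; returns -1 when the name is unknown. */
int getConstValue(const char *constName)
{
    size_t count = sizeof constant_table / sizeof constant_table[0];
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(constName, constant_table[i].name) == 0)
            return constant_table[i].value;
    }
    return -1;
}
```

Adding a constant then means adding one X(...) line, and both the enum and the lookup table pick it up.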
You can probably do this with gperf, which generates a lookup function that uses a perfect hash function.
Create a file similar to the following and run gperf with the -t option:
struct constant { char *name; int value; };
%%
C_CONST_NAME1, 1
C_CONST_NAME2, 2
gperf will output C (or C++) code that does the lookup in constant time, returning a pointer to the key/value pair, or NULL.
If you find that your keyword set is too large for gperf, consider using cmph instead.
There's no such capability built into C. However, you can use a tool such as doxygen to extract all #defines from your source code into a data structure that can be read at runtime (doxygen can store all macro definitions to XML).