How can I print variables in a specific sharable? - gdb

To simplify things, let's say I have the following in xxx.c:
int foo = 0;
void bar() {
...
}
Conditional compilation inside bar let's me compile in two ways. I compile using the first way, and create a shared library, let's call it lib1.so. Then I compile using the second way, and create another shared library, let's call it lib2.so. Now I run my main program, and I dynamically load (dlopen) both lib1.so and lib2.so. If I ask gdb to "print foo", it will print out a value, but which one is it? I can't qualify foo with a filename ('xxx.c'::foo) because the source name is the same for both sharables. Is there a way to tell gdb to specifically print foo from lib1.so, or foo from lib2.so?
If I set a breakpoint on 'bar', gdb is smart enough to set 2 breakpoints, one in each shareable. So I am a little surprised that "print foo" does not similarly print two values, one for each shareable.

How can I print variables in a specific sharable?
I don't believe there is currently a way to do that. This bug is relevant.
Note: if you have external variable (that is, you didn't use any special flags, such as -fvisibility-hidden or a linker script to mark foo hidden), then all references to foo will bind to whichever library was dlopened first, so only one instance of foo will be used from either instance of bar (this is one major difference between how UNIX shared libraries and Windows DLLs work).
You could of course look up the other instance via dlsym(..., "foo").

Related

AVR G++: Executing a function that is past the 128 Kb ROM boundary

AVR g++ has a pointer size of 16 bits. However, my particular chip (the ATMega2560) has 256 KB of RAM. To support this, the compiler automatically generates trampoline sections in the same section of ROM as the current executing code that then contains the extended assembly code to jump into high memory or back. In order for trampolines to be generated, you must take the address of something that sits in high memory.
In my scenario, I have a bootloader that I have written sitting in high memory. The application code needs to be able to call a function in the bootloader. I know the address of this function and need to be able to directly address it by hard-coding the address in my code.
How can I get the compiler/linker to generate the appropriate trampoline for an arbitrary address?
Compiler and linker will only generate trampoline code when the far address is a symbolic address rather than a literal constant number already in code. something like (assuming the address you want to jump to is 0x20000).
extern void (*farfun)() = 0x20000;
farfun ();
Will definitely not work, it doesn't cause the linker to do anything because the address is already resolved.
You should be able to inject the symbol address in the linker command line like so:
extern void farfun ();
farfun ();
compiling "normally" and linking with
-Wl,--defsym,farfun=0x20000
I think it's clear that you need to make sure yourself that something sensible sits at farfun.
You will most probably also need --relax.
EDIT
Never tried this myself, but maybe:
You could probably try to store the function address in a table in high memory and declare it like this:
extern void (*farfunctable [10])();
(farfunctable [0])();
and use the very same linker command to resolve the external symbol (now your table at 0x20000 (in the bootloader) needs to look like this:
extern void func1();
extern void func2();
void ((*farfunctab [10])() = {
func1,
func2,....
};
I would recommend to put func1() ... func10() in a different module from farfunctab in order to make the linker know it has to generate trampolins.
I was planning on putting a dispatch struct (that is, a struct with function pointers to all the various functions). Your solution works well, but requires knowing all of the locations of all of the functions ahead of time. Is there a way to execute a function call to a far address that isn't known at compile time?
[...] My goal was to put the struct with pointers to the functions in a fixed location. That way, it would be a single thing that needed a fixed address rather than every external function.
So you have two applications, let's call them App and Boot, where Boot provides some functionalities that App wants to use. The following problems have to be addressed:
How to get addresses from Boot into App.
How to build a jump table for Boot.
Avoid constructs that will crash when App tries to use code from Boot, like: Using indirect calls or jumps, using static constructors or using static storage in Boot.
App uses Addresses of boot.elf directly
Linking with -Wl,-R,boot.elf
A simple way would be to just link app.elf against boot.elf be means of -Wl,-R,boot.elf. Option -R instructs the linker to use symbol values from the specified file without dragging any code. Problem is that there's no way to specify which symbols to use, for example this might lead to a situation where App uses libgcc functions from Boot.
Defining Symbols by means of -Wl,--defsym,symbol=value
A bit more control over which symbols are being defined can be implemented by following a specific naming convention. Suppose that all symbols from Boot that have "boot" in their name should be "exported", then you could just
> avr-nm -g boot.elf | grep ' T ' | awk '/boot/ { printf("--defsym %s=0x%s\n",$3,$1) }' > syms.opt
This prints global symbol values, and grep filters out symbols in the text section. awk then transforms lines like 00020102 T boot1 to lines like
--defsym boot1=0x00020102 which are written to an option file syms.opt. The option file can then be provided to the linker by means of -Wl,#syms.opt.
The advantage of an option file is that it is easier to provide than plain options in a build environment like make: app.elf would depend (amongst others) on syms.opt, which in turn would depend on boot.elf.
Defining Symbols in a Linker Script Snippet
An alternative would be to define the symbols in a linker script augmentation, which you would provide by means of -T syms.ld during link and which would contain
"boot1"=ABSOLUTE(0x00020102);
"boot2"=...
...
INSERT AFTER .text
Defining Symbols in an Assembly Module
Yet another way to define the symbols would be by means of an assembly module which contains definitions like .global boot1 together with boot1 = 0x00020102.
All these approaches have in common that all symbols must be defined, or otherwise the linker will throw an undefined symbol error. This means boot.elf must be available, and it does not matter whether just one symbol is undefined or whether dozends of symbols are undefined.
Let Boot provide a Dispatch Table
The problem with using boot.elf directly, like lined out in the previous section, is that it introduces a direct dependency. This means that if Boot is improved or refactored, then you'll also have to re-compile App each time, even if the interface did not change.
A solution is to let Boot provide a dispatch table whose position and layout are known ahead of time. Only when the interface itself changes, App will have to be rebuilt. Just refactoring Boot will not require to re-build App.
The Assembly Module with the Jump Table
As explained in the "Crash" section below, addresses in a dispatch table (and hence indirect jumps) won't work because EIND has a wrong value. Therefore, let's assume we have a table of JMPs to the desired Boot functions, like in an assembly module boot-table.sx that reads:
;;; Linker description file boot.ld locates input section .boot.table
;;; right after .vectors, hence the address of .boot_table will be
;;; text-section-start + _VECTORS_SIZE, where the latter is
;;; #define'd in <avr/io.h>.
;;; No "x" section flag so that the linker won't relax JMPs to RJMPs.
.section .boot.table,"a",#progbits
.global .boot_table
.type .boot_table,#object
boot_table:
jmp boot1
jmp boot2
.size boot_table, .-boot_table
In this example, we are going to locate the jump table right after .vectors, so that its location is known ahead of time. The respective symbol definitions in App's syms.opt will then read
--defsym boot1=0x20000+vectors_size+0*4
--defsym boot2=0x20000+vectors_size+1*4
provided Boot is located at 0x20000. Symbol vectors_size can be defined in a C/C++ module, here by abusing avr-gcc attribute "address":
#include <avr/io.h>
__attribute__((__address__(_VECTORS_SIZE)))
char vectors_size;
Locating the Jump Table
In order to locate input section .boot.table, we need an own linker description file, which you might already use for Boot anyways. We start with a linker script from avr-gcc installation at ./avr/lib/ldscripts/avr6.xn, copy it to boot.ld, and add the following 2 lines after vectors:
...
.text :
{
*(.vectors)
KEEP(*(.vectors))
*(.boot.table)
KEEP(*(.boot.table))
/* For data that needs to reside in the lower 64k of progmem. */
*(.progmem.gcc*)
...
Auto-Generating Boot's Jump Table Module and the Symbols for App
It's highly advisable to have an interface description used by both App and Boot, say common.h. Moreover, in order to keep Boot's boot-table.sx and App's syms.opt in sync with the interface, it's agood idea to auto-generate these two files from common.h. To that end, assume that common.h reads:
#ifndef COMMON_H
#define COMMON_H
#define EX __attribute__((__used__,__externally_visible__))
EX int boot1 /* #boot_table:0 */ (int);
EX int boot2 /* #boot_table:1 */ (void);
#endif /* COMMON_H */
For the matter of simplicity, let's assume that this is C code or the interfaces are extern "C" so that the symbols in source code match the assembly names, and there's no need to use mangled names. It' easy enough to generate boot-table.sx and syms.opt from common.h using the magic comments. The magic comment follows directly after the symbol, so a regex would retrieve the token left of the magic comment, something like Python:
# ... symbol /* #boot_table:index */...
pat = re.compile (r".*(\b\w+\b)\s*/\* #boot_table:(\d+) \*/.*")
for line in sys.stdin.readlines():
match = re.match (pat, line)
if match:
index = int (match.group(2))
symbol = match.group(1)
Output template for syms.opt would be something like:
asm_line = "--defsym {symbol}=0x20000+vectors_size+4*{index}\n"
Code that will crash
Using Boot code from App is subject to several restrictions:
Indirect Calls and Jumps
These will crash because the start addresses of App resp. Boot are in different 128KiB segments of flash. When the address of a code symbol is taken, the compiler does this per gs(symbol) which instructs the linker to generate a stub and resolve gs() to that stub in .trampolines if the target address is outside the 128KiB segment where the trampolines are located. An explanation of gs() can be found in this answer, there is however more to it: The startup code will effectively initialize
EIND = __vectors >> 17;
see gcrt1.S, the AVR-LibC bits of start-up code crt<device>.o. The compiler assumes EIND never changes during execution, see EIND and more than 128KiB of Flash in the GCC documentation.
This means code in Boot assumes EIND = 1 but is called with EIND = 0 and hence EICALL resp. EIJMP will target the wrong address. This means common code must avoid indirect calls and jumps, and should be compiled with -fno-jump-tables so that switch/case won't generate such tables.
This also implies that the dispatch table described above won't work if it would just held gs(symbol) entries, because App and Boot will disagree on EIND.
Data in Static Storage
If common Boot code is using data in static storage, the data might collide with App's static storage. One way out is to avoid static storage in respective parts of Boot and pass addresses to, say, some data buffer by means of pointer erguments of respective functions.
One could have completely separate RAM areas; one for Boot and one for App, but that would be a waste of RAM because the applications will never run at the same time.
Static Constructors
Boot's static constructors will be bypassed if App uses code from Boot. This includes:
C++ code in Boot that explicitly or implicitly generates such constructors.
C/C++ code in Boot that relies on __attribute__((__constructor__)) or code in section .initN which is supposed to run prior to main.
Start-up code that initializes static storage, EIND etc., which is also run by locating it in some .initN sections, but will be bypassed if App calls Boot code.

How does it work and compile a C++ extension of TCL with a Macro and no main function

I have a working set of TCL script plus C++ extension but I dont know exactly how it works and how was it compiled. I am using gcc and linux Arch.
It works as follows: when we execute the test.tcl script it will pass some values to an object of a class defined into the C++ extension. Using these values the extension using a macro give some result and print some graphics.
In the test.tcl scrip I have:
#!object
use_namespace myClass
proc simulate {} {
uplevel #0 {
set running 1
for {} {$running} { } {
moveBugs
draw .world.canvas
.statusbar configure -text "t:[tstep]"
}
}
}
set toroidal 1
set nx 100
set ny 100
set mv_dist 4
setup $nx $ny $mv_dist $toroidal
addBugs 100
# size of a grid cell in pixels
set scale 5
myClass.scale 5
The object.cc looks like:
#include //some includes here
MyClass myClass;
make_model(myClass); // --> this is a macro!
The Macro "make_model(myClass)" expands as follows:
namespace myClass_ns { DEFINE_MYLIB_LIBRARY; int TCL_obj_myClass
(mylib::TCL_obj_init(myClass),TCL_obj(mylib::null_TCL_obj,
(std::string)"myClass",myClass),1); };
The Class definition is:
class MyClass:
{
public:
int tstep; //timestep - updated each time moveBugs is called
int scale; //no. pixels used to represent bugs
void setup(TCL_args args) {
int nx=args, ny=args, moveDistance=args;
bool toroidal=args;
Space::setup(nx,ny,moveDistance,toroidal);
}
The whole thing creates a cell-grid with some dots (bugs) moving from one cell to another.
My questions are:
How do the class methods and variables get the script values?
How is possible to have c++ code and compile it without a main function?
What is that macro doing there in the extension and how it works??
Thanks
Whenever a command in Tcl is run, it calls a function that implements that command. That function is written in a language like C or C++, and it is passed in the arguments (either as strings or Tcl_Obj* values). A full extension will also include a function to do the library initialisation; the function (which is external, has C linkage, and which has a name like Foo_Init if your library is foo.dll) does basic setting up tasks like registering the implementation functions as commands, and it's explicit because it takes a reference to the interpreter context that is being initialised.
The implementation functions can do pretty much anything they want, but to return a result they use one of the functions Tcl_SetResult, Tcl_SetObjResult, etc. and they have to return an int containing the relevant exception code. The usual useful ones are TCL_OK (for no exception) and TCL_ERROR (for stuff's gone wrong). This is a C API, so C++ exceptions aren't allowed.
It's possible to use C++ instance methods as command implementations, provided there's a binding function in between. In particular, the function has to get the instance pointer by casting a ClientData value (an alias for void* in reality, remember this is mostly a C API) and then invoking the method on that. It's a small amount of code.
Compiling things is just building a DLL that links against the right library (or libraries, as required). While extensions are usually recommended to link against the stub library, it's not necessary when you're just developing and testing on one machine. But if you're linking against the Tcl DLL, you'd better make sure that the code gets loaded into a tclsh that uses that DLL. Stub libraries get rid of that tight binding, providing pretty strong ABI stability, but are little more work to set up; you need to define the right C macro to turn them on and you need to do an extra API call in your initialisation function.
I assume you already know how to compile and link C++ code. I won't tell you how to do it, but there's bound to be other questions here on Stack Overflow if you need assistance.
Using the code? For an extension, it's basically just:
# Dynamically load the DLL and call the init function
load /path/to/your.dll
# Commands are all present, so use them
NewCommand 3
There are some extra steps later on to turn a DLL into a proper Tcl package, abstracting code that uses the DLL away from the fact that it is exactly that DLL and so on, but they're not something to worry about until you've got things working a lot more.

in c++ main function is the entry point to program how i can change it to an other function?

I was asked an interview question to change the entry point of a C or C++ program from main() to any other function. How is it possible?
In standard C (and, I believe, C++ as well), you can't, at least not for a hosted environment (but see below). The standard specifies that the starting point for the C code is main. The standard (c99) doesn't leave much scope for argument:
5.1.2.2.1 Program startup: (1) The function called at program startup is named main.
That's it. It then waffles on a bit about parameters and return values but there's really no leeway there for changing the name.
That's for a hosted environment. The standard also allows for a freestanding environment (i.e., no OS, for things like embedded systems). For a freestanding environment:
In a freestanding environment (in which C program execution may take place without any benefit of an operating system), the name and type of the function called at program startup are implementation-defined. Any library facilities available to a freestanding program, other than the minimal set required by clause 4, are implementation-defined.
You can use "trickery" in C implementations so that you can make it look like main isn't the entry point. This is in fact what early Windows compliers did to mark WinMain as the start point.
First way: a linker may include some pre-main startup code in a file like start.o and it is this piece of code which runs to set up the C environment then call main. There's nothing to stop you replacing that with something that calls bob instead.
Second way: some linkers provide that very option with a command-line switch so that you can change it without recompiling the startup code.
Third way: you can link with this piece of code:
int main (int c, char *v[]) { return bob (c, v); }
and then your entry point for your code is seemingly bob rather than main.
However, all this, while of possibly academic interest, doesn't change the fact that I can't think of one single solitary situation in my many decades of cutting code, where this would be either necessary or desirable.
I would be asking the interviewer: why would you want to do this?
The entry point is actually the _start function (implemented in crt1.o) .
The _start function prepares the command line arguments and then calls main(int argc,char* argv[], char* env[]),
you can change the entry point from _start to mystart by setting a linker parameter:
g++ file.o -Wl,-emystart -o runme
Of course, this is a replacement for the entry point _start so you won't get the command line arguments:
void mystart(){
}
Note that global/static variables that have constructors or destructors must be initialized at the beginning of the application and destroyed at the end. Keep that in mind if you are planning on bypassing the default entry point which does it automatically.
From C++ standard docs 3.6.1 Main Function,
A program shall contain a global function called main, which is the designated start of the program. It is implementation-defined
whether a program in a freestanding environment is required to define a main function.
So, it does depend on your compiler/linker...
If you are on VS2010, this could give you some idea
As it is easy to understand, this is not mandated by the C++ standard and falls in the domain of 'implemenation specific behavior'.
This is highly speculative, but you might have a static initializer instead of main:
#include <iostream>
int mymain()
{
std::cout << "mymain";
exit(0);
}
static int sRetVal = mymain();
int main()
{
std::cout << "never get here";
}
You might even make it 'Java-like', by putting the stuff in a constructor:
#include <iostream>
class MyApplication
{
public:
MyApplication()
{
std::cout << "mymain";
exit(0);
}
};
static MyApplication sMyApplication;
int main()
{
std::cout << "never get here";
}
Now. The interviewer might have thought about these, but I'd personally never use them. The reasons are:
It's non-conventional. People won't understand it, it's nontrivial to find the entry point.
Static initialization order is nondeterministic. Put in another static variable and you'll never now if it gets initialized.
That said, I've seen it being used in production instead of init() for library initializers. The caveat is, on windows, (from experience) your statics in a DLL might or might not get initialized based on usage.
Modify the crt object that actually calls the main() function, or provide your own (don't forget to disable linking of the normal one).
With gcc, declare the function with attribute((constructor)) and gcc will execute this function before any other code including main.
For Solaris Based Systems I have found this. You can use the .init section for every platforms I guess:
pragma init (function [, function]...)
Source:
This pragma causes each listed function to be called during initialization (before main) or during shared module loading, by adding a call to the .init section.
It's very simple:
As you should know when you use constants in c, the compiler execute a kind of 'macro' changing the name of the constant for the respective value.
just include a #define argument in the beginning of your code with the name of start-up function followed by the name main:
Example:
#define my_start-up_function (main)
I think it is easy to remove the undesired main() symbol from the object before linking.
Unfortunately the entry point option for g++ is not working for me(the binary crashes before entering the entry point). So I strip undesired entry-point from object file.
Suppose we have two sources that contain entry point function.
target.c contains the main() we do not want.
our_code.c contains the testmain() we want to be the entry point.
After compiling(g++ -c option) we can get the following object files.
target.o, that contains the main() we do not want.
our_code.o that contains the testmain() we want to be the entry point.
So we can use the objcopy to strip undesired main() function.
objcopy --strip-symbol=main target.o
We can redefine testmain() to main() using objcopy too.
objcopy --redefine-sym testmain=main our_code.o
And then we can link both of them into binary.
g++ target.o our_code.o -o our_binary.bin
This works for me. Now when we run our_binary.bin the entry point is our_code.o:main() symbol which refers to our_code.c::testmain() function.
On windows there is another (rather unorthodox) way to change the entry point of a program: TLS. See this for more explanations: http://isc.sans.edu/diary.html?storyid=6655
Yes,
We can change the main function name to any other name for eg. Start, bob, rem etc.
How does the compiler knows that it has to search for the main() in the entire code ?
Nothing is automatic in programming.
somebody has done some work to make it looks automatic for us.
so it has been defined in the start up file that the compiler should search for main().
we can change the name main to anything else eg. Bob and then the compiler will be searching for Bob() only.
Changing a value in Linker Settings will override the entry point. i.e., MFC applications use a value of 'Windows (/SUBSYSTEM:WINDOWS)' to change entry point from main() to CWinApp::WinMain().
Right clicking on solution > Properties > Linker > System > Subsystem > Windows (/SUBSYSTEM:WINDOWS)
...
Very practical benefit to modifying entry point:
MFC is a framework we take advantage of to write Windows applications in C++. I know it's ancient, but my company maintains one for legacy reasons! You will not find a main() in MFC code. MSDN says the entry point is WinMain(), instead. Thus, you can override the WinMain() of your base CWinApp object. Or, most people override CWinApp::InitInstance() because the base WinMain() will call it.
Disclaimer: I use empty parentheses to denote a method, without caring how many arguments.

How do I make an unreferenced object load in C++?

I have a .cpp file (let's call it statinit.cpp) compiled and linked into my executable using gcc.
My main() function is not in statinit.cpp.
statinit.cpp has some static initializations that I need running.
However, I never explicitly reference anything from statinit.cpp in my main(), or in anything referenced by it.
What happens (I suppose) is that the linked object created from statinit.cpp is never loaded on runtime, so my static initializations are never run, causing a problem elsewhere in the code (that was very difficult to debug, but I traced it eventually).
Is there a standard library function, linker option, compiler option, or something that can let me force that object to load on runtime without referencing one of its elements?
What I thought to do is to define a dummy function in statinit.cpp, declare it in a header file that main() sees, and call that dummy function from main(). However, this is a very ugly solution and I'd very much like to avoid making changes in statinit.cpp itself.
Thanks,
Daniel
It is not exactly clear what the problem is:
C++ does not have the concept of static initializers.
So one presume you have an object in "File Scope".
If this object is in the global namespace then it will be constructed before main() is called and destroyed after main() exits (assuming it is in the application).
If this object is in a namespace then optionally the implementation can opt to lazy initialize the variable. This just means that it will be fully initialized before first use. So if you are relying on a side affect from construction then put the object in the global namespace.
Now a reason you may not be seeing the constructor to this object execute is that it was not linked into the application. This is a linker issue and not a language issue. This happens when the object is compiled into a static library and your application is then linked against the static library. The linker will only load into the application functions/objects that are explicitly referenced from the application (ie things that resolve undefined things in the symbol table).
To solve this problem you have a couple of options.
Don't use static libraries.
Compile into dynamic libraries (the norm nowadays).
Compile all the source directly into the application.
Make an explicit reference to the object from within main.
I ran into the same problem.
Write a file, DoNotOptimizeAway.cpp:
void NoDeadcodeElimination()
{
// Here use at least once each of the variables that you'll need.
}
Then call NoDeadcodeElimination() from main.
EDIT: alternatively you can edit your linker options and tell it to always link everything, even if it's not used. I don't like this approach though since executables will get much bigger.
These problems, and the problems with these potential solutions all revolve around the fact that you can't guarantee much about static initialization. So since it's not dependable, don't depend on it!
Explicitly initialize data with a static "InititalizeLibrary" type static function. Now you guarantee it happens, and you guarantee when it happens in relation to other code based on when you make the call.
One C++'ish way to do this is with Singletons.
Essentially, write a function to return a reference to the object. To force it to initialize, make it a static object inside the function.
Make a class static function that is vaguely like this:
class MyClass {
static MyClass& getObject()
{
static MyObject obj;
return obj;
}
};
Since you are using C++, you could always declare a global object (ie a global variable that references a class in statinit.cpp. As always, the constructor will be called on initialization and since the object is global, this will be called before main() is run.
There is one very important caveat though. There is no guarantee as to when the constructor will be called and there is no way to explicitly order when each of these constructors is called. This will also probably defeat any attempt to check for memory leaks since you can no longer guarantee that all the memory allocated while running main has been deallocated.
Is the problem that the static items were never initialized, or is the problem that the static items weren't initialized when you needed to use them?
All static initialization is supposed to be finished before your main() runs. However, you can run into issues if you initialize on static object with another static object. (Note: this doesn't apply if you are using primitives such as int)
For example, if you have in file x.cpp:
static myClass x(someVals);
And in y.cpp:
static myClass y = x * 2;
It's possible that the system will try to instantiate y before x is created. In that case, the "y" variable will likely be 0, as x is likely 0 before it is initialized.
In general, the best solution for this is to instantiate the object when it is first used (if possible). However, I noticed above you weren't allowed to modify that file. Are the values from that file being used elsewhere, and maybe you can change how those values are accessed?
Read the manual page for the ld command and look at the -u option. If statinit.cpp defines anything that looks like a function then try naming it in -u. Otherwise choose a data object that's defined in statinit.cpp and name that in -u, and hope it works. I think it's best if you write the command line so the -u option comes immediately before your -l option for your library that has statinit's object code in it.
Of course the dynamic library solution is the best, but I've also been told it's possible to link the whole static library with the linker option:
-Wl,-whole-archive
before the library's -l option, and
-Wl,-no-whole-archive
after it (to avoid including other libraries as a whole, too).

Compiling a DLL with gcc

Sooooo I'm writing a script interpreter. And basically, I want some classes and functions stored in a DLL, but I want the DLL to look for functions within the programs that are linking to it, like,
program dll
----------------------------------------------------
send code to dll-----> parse code
|
v
code contains a function,
that isn't contained in the DLL
|
list of functions in <------/
program
|
v
corresponding function,
user-defined in the
program--process the
passed argument here
|
\--------------> return value sent back
to the parsing function
I was wondering basically, how do I compile a DLL with gcc? Well, I'm using a windows port of gcc. Once I compile a .dll containing my classes and functions, how do I link to it with my program? How do I use the classes and functions in the DLL? Can the DLL call functions from the program linking to it? If I make a class { ... } object; in the DLL, then when the DLL is loaded by the program, will object be available to the program? Thanks in advance, I really need to know how to work with DLLs in C++ before I can continue with this project.
"Can you add more detail as to why you want the DLL to call functions in the main program?"
I thought the diagram sort of explained it... the program using the DLL passes a piece of code to the DLL, which parses the code, and if function calls are found in said code then corresponding functions within the DLL are called... for example, if I passed "a = sqrt(100)" then the DLL parser function would find the function call to sqrt(), and within the DLL would be a corresponding sqrt() function which would calculate the square root of the argument passed to it, and then it would take the return value from that function and put it into variable a... just like any other program, but if a corresponding handler for the sqrt() function isn't found within the DLL (there would be a list of natively supported functions) then it would call a similar function which would reside within the program using the DLL to see if there are any user-defined functions by that name.
So, say you loaded the DLL into the program giving your program the ability to interpret scripts of this particular language, the program could call the DLLs to process single lines of code or hand it filenames of scripts to process... but if you want to add a command into the script which suits the purpose of your program, you could say set a boolean value in the DLL telling it that you are adding functions to its language and then create a function in your code which would list the functions you are adding (the DLL would call it with the name of the function it wants, if that function is a user-defined one contained within your code, the function would call the corresponding function with the argument passed to it by the DLL, the return the return value of the user-defined function back to the DLL, and if it didn't exist, it would return an error code or NULL or something). I'm starting to see that I'll have to find another way around this to make the function calls go one way only
This link explains how to do it in a basic way.
In a big picture view, when you make a dll, you are making a library which is loaded at runtime. It contains a number of symbols which are exported. These symbols are typically references to methods or functions, plus compiler/linker goo.
When you normally build a static library, there is a minimum of goo and the linker pulls in the code it needs and repackages it for you in your executable.
In a dll, you actually get two end products (three really- just wait): a dll and a stub library. The stub is a static library that looks exactly like your regular static library, except that instead of executing your code, each stub is typically a jump instruction to a common routine. The common routine loads your dll, gets the address of the routine that you want to call, then patches up the original jump instruction to go there so when you call it again, you end up in your dll.
The third end product is usually a header file that tells you all about the data types in your library.
So your steps are: create your headers and code, build a dll, build a stub library from the headers/code/some list of exported functions. End code will link to the stub library which will load up the dll and fix up the jump table.
Compiler/linker goo includes things like making sure the runtime libraries are where they're needed, making sure that static constructors are executed, making sure that static destructors are registered for later execution, etc, etc, etc.
Now as to your main problem: how do I write extensible code in a dll? There are a number of possible ways - a typical way is to define a pure abstract class (aka interface) that defines a behavior and either pass that in to a processing routine or to create a routine for registering interfaces to do work, then the processing routine asks the registrar for an object to handle a piece of work for it.
On the detail of what you plan to solve, perhaps you should look at an extendible parser like lua instead of building your own.
To your more specific focus.
A DLL is (typically?) meant to be complete in and of itself, or explicitly know what other libraries to use to complete itself.
What I mean by that is, you cannot have a method implicitly provided by the calling application to complete the DLLs functionality.
You could however make part of your API the provision of methods from a calling app, thus making the DLL fully contained and the passing of knowledge explicit.
How do I use the classes and functions in the DLL?
Include the headers in your code, when the module (exe or another dll) is linked the dlls are checked for completness.
Can the DLL call functions from the program linking to it?
Yes, but it has to be told about them at run time.
If I make a class { ... } object; in the DLL, then when the DLL is loaded by the program, will object be available to the program?
Yes it will be available, however there are some restrictions you need to be aware about. Such as in the area of memory management it is important to either:
Link all modules sharing memory with the same memory management dll (typically c runtime)
Ensure that the memory is allocated and dealloccated only in the same module.
allocate on the stack
Examples!
Here is a basic idea of passing functions to the dll, however in your case may not be most helpfull as you need to know up front what other functions you want provided.
// parser.h
struct functions {
void *fred (int );
};
parse( string, functions );
// program.cpp
parse( "a = sqrt(); fred(a);", functions );
What you need is a way of registering functions(and their details with the dll.)
The bigger problem here is the details bit. But skipping over that you might do something like wxWidgets does with class registration. When method_fred is contructed by your app it will call the constructor and register with the dll through usage off methodInfo. Parser can lookup methodInfo for methods available.
// parser.h
class method_base { };
class methodInfo {
static void register(factory);
static map<string,factory> m_methods;
}
// program.cpp
class method_fred : public method_base {
static method* factory(string args);
static methodInfo _methoinfo;
}
methodInfo method_fred::_methoinfo("fred",method_fred::factory);
This sounds like a job for data structures.
Create a struct containing your keywords and the function associated with each one.
struct keyword {
const char *keyword;
int (*f)(int arg);
};
struct keyword keywords[max_keywords] = {
"db_connect", &db_connect,
}
Then write a function in your DLL that you pass the address of this array to:
plugin_register(keywords);
Then inside the DLL it can do:
keywords[0].f = &plugin_db_connect;
With this method, the code to handle script keywords remains in the main program while the DLL manipulates the data structures to get its own functions called.
Taking it to C++, make the struct a class instead that contains a std::vector or std::map or whatever of keywords and some functions to manipulate them.
Winrawr, before you go on, read this first:
Any improvements on the GCC/Windows DLLs/C++ STL front?
Basically, you may run into problems when passing STL strings around your DLLs, and you may also have trouble with exceptions flying across DLL boundaries, although it's not something I have experienced (yet).
You could always load the dll at runtime with load library