pass parameters to dll? - c++

I'm new to C++ and Access. I'm working on a project that calls a DLL (compiled from C++) from Access.
I want to understand how the parameters are passed into the DLL.
The input data for the DLL is prepared in Access, and we call the DLL from Access.
We associate "RunFunction" with the DLL we want to call.
The line in Access that calls the DLL:
Results = RunFunction(Data.age, Data.calendar, Data.timesheet, Data.extra)
The C++ code that is compiled into the DLL:
double __stdcall RunFunction(double * iData, double(*iCalendar)[100], double(*iTimesheet)[100])
First question: from the C++ code, I found that *iData (in C++) actually contains all the info from Data (in Access).
Why does this happen? I thought only Data.age was passed into *iData, not the whole of Data.
Second question: RunFunction as called from Access has four input parameters, while the C++ function only takes three. Why doesn't this cause any issue?

First, consider that inside Access the value of Data.age might be inside a buffer containing the entire record or some other structure. So when the address of that one value is passed to you in C++, you can explore neighboring addresses and see what's in them. Don't do that!
Second, look at how the calling convention works. C's original convention (__cdecl) was designed in the early days of C, when function arguments were not checked at all: the caller cleans up the stack, so you can pass fewer or more parameters on the caller side and not mess up the stack. If you pass extra, no big deal. If you leave some off, then using the rightmost names in the function will give garbage values, and writing to them can cause all sorts of problems, including clobbering the return address. With __stdcall on 32-bit x86 the callee pops its own arguments, so a mismatch there can unbalance the stack; on 64-bit Windows __stdcall is ignored and the caller cleans up, so an extra argument is simply never read.
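To illustrate the first point, here is a minimal sketch. It assumes (hypothetically) that the caller keeps the record's fields as contiguous doubles and passes only the address of the first one; the names sumFirstN and demoNeighbours are made up for this example and are not part of the original project.

```cpp
#include <cstddef>

// Hypothetical callee: it is handed only a pointer to one double, but nothing
// stops it from indexing past that element if the caller's buffer is larger.
double sumFirstN(const double* first, std::size_t n) {
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += first[i];   // first[0] is the "age"; first[1..] are its neighbours
    }
    return total;
}

// Hypothetical caller-side layout: age, followed by other fields of the record.
double demoNeighbours() {
    double record[4] = {42.0, 1.0, 2.0, 3.0};
    return sumFirstN(&record[0], 4);   // passes only the address of record[0]
}
```

The callee only "sees the whole record" because the caller happened to store it contiguously; relying on that layout from the DLL side is exactly what the answer warns against.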

How to safely return objects from DLL calls

I am fairly new to C++ and working with DLLs. I have a main application that aggregates results from different measurements. As the measurements are different from case to case I decided to put them into external DLLs so they can be loaded at runtime (they simply all export the same function). The idea is to just load them like this so the aggregator can be extended depending on the runtime needs:
typedef int (*measure)(measurement &dataHolder);
int callM() {
[...]
measurement dataHolder;
lib = LoadLibraryA("measureDeviceTypeA.dll");
measure measureFunc = (measure)GetProcAddress(lib, "measureFunc");
measureFunc(dataHolder);
[...] // close the lib and load the next one depending on found Devices
}
This works pretty well for simple datatypes (depending on the actual definition of the struct "measurement") such as this:
typedef struct measurement {
DWORD realPBS;
DWORD imaginaryPBS;
int a;
} measurement;
Now there also may be a string of arbitrary length (char representations of results). I would like to put them into the measurement struct as well and fill them inside the actual worker function inside the DLL. My first assumption was that it would be easy to just use std::string, which works sometimes and sometimes not (as it will reallocate memory on std::string().append() and this might break (access violation) depending on the actual runtime environment of the program and the dll). I read here and here that returning a string from a function is a bad idea.
So what would be the "proper" C++ way of returning arbitrary length strings from such a call? Is it helpful at all to pass a struct to the DLL or should I split it into separate calls? I don't want to have pointers dangling around or unfreed memory when I close the DLL again.
This won't work with std::string, as noted by Dani in the comments. The problem is that std::string is a type that belongs to your implementation, and different C++ implementations have different std::strings.
For DLLs specifically (Microsoft), you do have another alternative. COM is an ancient technology, but it still works today and is unlikely ever to go away. And it has its own string type, BSTR. Visual Studio provides a helper C++ class, _bstr_t, for your own code, but on the interface you'd use the plain BSTR from _bstr_t::GetBSTR.
BSTR relies on the Windows allocator SysAllocString from OleAut32.dll, so the DLL and its caller allocate and free through the same module.
The problem is that the string data is often allocated on the heap, so it has to be freed / managed somehow.
You might think: hey, std::string is returned by value, so why do I need to care about memory management? The problem is that usually only very small strings are stored "inside" the class. For larger strings the string class contains a pointer to some "heap storage".
DLLs can be used from and with different programming languages, which is why a DLL and its caller do not necessarily share a "memory manager"; freeing memory on one side that was allocated on the other can fail.
To solve this you need two function calls: one that returns a pointer / handle to the data and one to free it. Or the caller could give the callee a pointer to where it wants the data to be stored. For that you also need a maximum byte count.
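A minimal sketch of the second option, the caller-supplied buffer with a maximum byte count. The function name getResultText and the sample text are assumptions for illustration only; a real DLL would also add its export decoration (for example __declspec(dllexport)).

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical exported function: copies the result text into a buffer owned
// by the caller. Returns the number of bytes needed (including the terminating
// '\0'), so the caller can retry with a larger buffer if the text was truncated.
extern "C" std::size_t getResultText(char* buffer, std::size_t capacity) {
    static const char text[] = "PBS: 1.25 + 0.5i";   // stand-in for a real result
    const std::size_t needed = sizeof text;            // includes the '\0'
    if (buffer != nullptr && capacity > 0) {
        const std::size_t n = (needed < capacity) ? needed : capacity;
        std::memcpy(buffer, text, n);
        buffer[n - 1] = '\0';                          // always terminate, even when truncated
    }
    return needed;
}
```

A caller can first pass nullptr and 0 to learn the required size, allocate that much with its own allocator, and call again. No memory ever crosses the DLL boundary with a different owner.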
As you can see, there are some reasons why you should avoid these APIs - but it is not always possible. See for example the Windows API (there you can find both approaches).
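The other approach, a pair of calls where the DLL both allocates and frees, might look roughly like this. The names createResultText and freeResultText are hypothetical; the point is only that the same module that calls new[] also calls delete[].

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical pair of exported functions. The DLL allocates the string and the
// DLL frees it, so only one memory manager ever touches the block.
extern "C" char* createResultText() {
    static const char text[] = "measurement finished";
    char* copy = new char[sizeof text];
    std::memcpy(copy, text, sizeof text);   // copies the terminating '\0' as well
    return copy;
}

extern "C" void freeResultText(char* text) {
    delete[] text;   // must match the new[] used in createResultText
}
```

The caller must hand every pointer it got from createResultText back to freeResultText before unloading the DLL, and must never free it itself.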
Another approach would be to ensure a shared memory manager, but this is tricky because it has to be set up very early!

is it possible to use function pointers this way?

This is something that recently crossed my mind, quoting from wikipedia: "To initialize a function pointer, you must give it the address of a function in your program."
So, I can't make it point to an arbitrary memory address, but what if I overwrite the memory at the address of the function with a piece of data the same size as before and then invoke it via the pointer? If such data corresponds to an actual function and the two functions have matching signatures, the latter should be invoked instead of the first.
Is it theoretically possible?
I apologize if this is impossible due to some very obvious reason that I should be aware of.
If you're writing something like a JIT, which generates native code on the fly, then yes you could do all of those things.
However, in order to generate native code you obviously need to know some implementation details of the system you're on, including how its function pointers work and what special measures need to be taken for executable code. For one example, on some systems after modifying memory containing code you need to flush the instruction cache before you can safely execute the new code. You can't do any of this portably using standard C or C++.
You might find when you come to overwrite the function, that you can only do it for functions that your program generated at runtime. Functions that are part of the running executable are liable to be marked write-protected by the OS.
The issue you may run into is Data Execution Prevention. It tries to keep you from executing data as code or allowing code to be written to like data. You can turn it off on Windows. Some compilers/OSes may also place code into const-like sections of memory that the OS/hardware protects. The standard says nothing about what should or should not work when you write an array of bytes to a memory location and then call a function that includes jumping to that location. It all depends on your hardware and your OS.
While the standard does not provide any guarantees as to what would happen if you make a function pointer that does not refer to a function, in real life, on your particular implementation and knowing the platform, you may be able to do that with raw data.
I have seen example programs that created a char array with the appropriate binary code and had it execute by careful casting of pointers. So in practice, and in a non-portable way, you can achieve that behavior.
It is possible, with the caveats given in other answers. You definitely do not want to overwrite memory at some existing function's address with custom code, though. Not only is executable memory typically not writable, but you have no guarantees as to how the compiler might have used that code. For all you know, the code may be shared by many functions that you think you're not modifying.
So, what you need to do is:
1. Allocate one or more memory pages from the system.
2. Write your custom machine code into them.
3. Mark the pages as non-writable and executable.
4. Run the code. There are two ways of doing it:
   a. Cast the address of the pages you got in #1 to a function pointer, and call the pointer.
   b. Execute the code in another thread. You pass the pointer to the code directly to a system API or framework function that starts the thread.
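The steps above can be sketched as follows, using the first way of running the code (casting to a function pointer). This is a non-portable illustration under loud assumptions: it only does anything on x86-64 Linux, it uses the POSIX calls mmap and mprotect as the page allocator, it assumes a 4096-byte request is enough, and the six bytes are hand-assembled x86-64 code for "mov eax, 42; ret". On any other platform the function just reports that it did nothing. The name runGeneratedCode is made up.

```cpp
#include <cstddef>
#include <cstring>

#if defined(__linux__) && defined(__x86_64__)
#include <sys/mman.h>
#define GENERATED_CODE_DEMO_SUPPORTED 1
#endif

// Returns true and stores the generated function's return value in `result`
// if the demo ran; returns false if the platform is unsupported or a system
// call failed.
bool runGeneratedCode(int& result) {
#ifdef GENERATED_CODE_DEMO_SUPPORTED
    // Assumption: x86-64 encoding of  mov eax, 42 ; ret
    const unsigned char code[] = {0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3};
    const std::size_t length = 4096;

    // 1. Allocate a writable (not yet executable) page from the system.
    void* page = mmap(nullptr, length, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) return false;

    // 2. Write the machine code into it.
    std::memcpy(page, code, sizeof code);

    // 3. Mark the page non-writable and executable.
    if (mprotect(page, length, PROT_READ | PROT_EXEC) != 0) {
        munmap(page, length);
        return false;
    }

    // 4a. Cast the page address to a function pointer and call it.
    int (*generated)() = reinterpret_cast<int (*)()>(page);
    result = generated();

    munmap(page, length);
    return true;
#else
    (void)result;   // nothing to do on unsupported platforms
    return false;
#endif
}
```

On Windows the equivalent system calls would be VirtualAlloc and VirtualProtect, and on some architectures you would additionally have to flush the instruction cache before the call.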
Your question is confusingly worded.
You can reassign function pointers, and you can assign them to null. The same goes for member pointers. Unless you declare them const, you can reassign them, and yes, the new function will be called instead. The signatures must match exactly. Use std::function instead.
You cannot "overwrite the memory at the address of a function". You probably can indeed do it some way, but just do not. You're writing into your program code and are likely to screw it up badly.
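For the legitimate part, reassigning a function pointer (or a std::function) so that a different function gets called, a small sketch is enough. The functions addOne and doubleIt are made up for the example.

```cpp
#include <functional>

int addOne(int x) { return x + 1; }
int doubleIt(int x) { return x * 2; }

int demoReassign() {
    int (*op)(int) = addOne;     // points at addOne
    int first = op(10);          // calls addOne
    op = doubleIt;               // reassigned: same signature, different function
    int second = op(10);         // now calls doubleIt

    std::function<int(int)> f = addOne;
    f = [](int x) { return x - 1; };   // std::function can also hold a lambda
    int third = f(10);

    return first + second + third;
}
```

Nothing is overwritten in the program's code here; only the pointer's value changes.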

C++/Win32 Dynamically calling a function without knowing its signature

This is a terrible idea, but I'm seeing if it's even feasible before I walk down this road.
I have to write a Win32 C++ program that can dynamically load a library based on a file that has serialized information on what DLL, function, signature, and arguments to use. Loading the library is trivial (LoadLibraryEx works fine). Then getting the function pointer is easy (not a big deal, GetProcAddress takes care of this). However, the rest is tricky.
Here's my plan of attack, feel free to let me know if this isn't the best approach:
1. Open the serialized information from a file on what DLL to load and what function to execute.
2. Call LoadLibraryEx to bring in the DLL.
3. Call GetProcAddress to get the function pointer (after casting the byte array to a string).
4. Write the arguments (which are read in as a byte array) to memory in bytes.
5. Get the address of the beginning of each argument (I'll know from serialization what the size of each argument is).
6. Using assembly, jump to the beginning of the function pointer, pushing the heap addresses of the arguments onto the stack (in reverse order).
7. Execute and get back the return value address (as a void * ?).
8. Use the memory address of the return value (that I got from assembly) and the size of the return type (which I got from the serialization) and write the raw bytes back to a file.
Keep in mind my limitations:
I will never know except for run-time what the signature, dll, function name is.
It is always read in from a file.
Is there a better approach, will this approach even work?
Update
For anyone who comes poking in this thread to learn more, I found a solution. In C you can dynamically load a library using dlopen (there's a Windows port of this for ease of use). Once loaded, you can dynamically execute functions using libffi (supports mac/ios/win, 64 and 32 bit). This only gets you to C functions and primitive types (pointer, uint, int, double, float), and that's about it. However, using the macosx objective-c bridge you can access objective-c by loading libobjc (osx's native obj-c to c "toll free" bridge). Then through that you can dynamically create obj-c and c++ classes. A similar technique can be done on Windows using C# and its marshaling capabilities.
This ends up with HIGH overhead, and you must be VERY careful about your memory; in addition, don't mix pointers from C/C#/C++. Finally, whatever you do at runtime, BE ABSOLUTELY SURE YOU KNOW YOUR TYPES. Seriously. BTW, libffi/cinvoke are amazing libraries.
There are existing libraries that can do what you describe, such as C/Invoke:
http://www.nongnu.org/cinvoke/
General rule: if you have a terrible idea, drop it and find a good one.
If the signature is not known at all, what you describe will fall on its face. Suppose your call works for my function as it is. I change the function from __stdcall to __cdecl or back, and you will crash.
Also, you don't handle the return value.
If you relax the "unknown" to allow some limitations, like fixing the return type and calling convention, you are somewhat ahead -- then you really can emulate the call, but what is it good for? This whole thing sounds like a self-service hack-me engine.
The normal way is to publish some fixed interface (using function signatures) and make the DLL support it. Or arrange some uniform data transfer mechanism.
I'd suggest you describe what you're really after and ask about that, maybe on Programmers SE.
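As a rough sketch of the "fixed interface plus uniform data transfer" idea: every plugin function shares one signature and exchanges raw bytes, so the host never needs to build a call frame by hand. Everything here (PluginFn, addInts, callWithTwoInts, the byte layout) is a hypothetical design for illustration, not an existing API.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical fixed plugin interface: one signature for every exported
// function. Returns the number of bytes written to `out`, or -1 on error.
using PluginFn = int (*)(const unsigned char* args, std::size_t argSize,
                         unsigned char* out, std::size_t outCapacity);

// Example plugin function: expects two 32-bit integers, writes their sum.
int addInts(const unsigned char* args, std::size_t argSize,
            unsigned char* out, std::size_t outCapacity) {
    std::int32_t a = 0, b = 0;
    if (argSize != sizeof a + sizeof b || outCapacity < sizeof(std::int32_t)) {
        return -1;   // the plugin validates the bytes it was given
    }
    std::memcpy(&a, args, sizeof a);
    std::memcpy(&b, args + sizeof a, sizeof b);
    const std::int32_t sum = a + b;
    std::memcpy(out, &sum, sizeof sum);
    return static_cast<int>(sizeof sum);
}

// Host side: in a real program `fn` would come from GetProcAddress.
std::int32_t callWithTwoInts(PluginFn fn, std::int32_t a, std::int32_t b) {
    unsigned char args[sizeof a + sizeof b];
    std::memcpy(args, &a, sizeof a);
    std::memcpy(args + sizeof a, &b, sizeof b);

    unsigned char out[sizeof(std::int32_t)];
    std::int32_t result = 0;
    if (fn(args, sizeof args, out, sizeof out) == static_cast<int>(sizeof result)) {
        std::memcpy(&result, out, sizeof result);
    }
    return result;
}
```

The serialized file then only needs the DLL name, the function name, and the argument bytes; interpreting those bytes is the plugin's job, not the host's.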

C++ DLL in VB crashing issue

I have a DLL that someone made for me in C++. I needed to use this DLL in VB; in order to do that I had to make another DLL in C++ that has functions I can call from VB.
The C++ DLL I made has 4 functions: 2 callback functions that retrieve information from the original C++ DLL, and 2 functions that I can call from VB to send that information.
I know the original DLL works fine, as I've tested it endlessly in a console app.
However, when I use it with my DLL and VB, I get random crashes.
There is almost no code in my VB app, as it's just for testing. It just outputs the information, so there's no problem there.
I believe the problem is in the C++ DLL I made. I am pretty new to C++.
I think maybe a variable gets accessed in two spots at the same time (is this possible?) and causes it to crash?
Here's the basic layout of my C++ DLL:
//global variables
CString allInfo="";
char* info=new char[25000];
//call back function 1
HANDLE OnInfo(SendInfo* tempInfo)
{
CString stringTemp="";
stringTemp=tempInfo->infomessage;
allInfo=allInfo+ stringTemp+"\n";
return 0;
}
//function for vb
BSTR _stdcall vbInfo()
{
allInfo=allInfo.Right(20000); //get last 20,000 characters
strcpy_s(info,20000,allInfo);
BSTR Message;
Message = SysAllocStringByteLen (info, lstrlen(info));
return Message;
}
Crash seems to happen completely randomly.
Any suggestions?
Thanks
Aside from learning that Googling for the CString class reference returns some ahem interesting results, your problem is probably going to be the multiple access of CString.
You didn't post a lot of info, so I'm going to assume the OnInfo method is a callback function that is called by a thread of execution different from the one that calls vbInfo. In this case, you want to look at the CString::operator=() method description on MSDN:
The CString assignment (=) operator reinitializes an existing CString
object with new data. If the destination string (that is, the left
side) is already large enough to store the new data, no new memory
allocation is performed. You should be aware that memory exceptions
may occur whenever you use the assignment operator because new storage
is often allocated to hold the resulting CString object.
Given that there does not appear to be a limit on the size of what you put into the CString, it may be deallocating and allocating the memory in allInfo in one function while you're reading or writing it in another function. Things don't go so well when you suddenly try to write to unallocated memory.
You might want to look at something like a Critical Section or a mutex to keep both of your functions from hosing the common memory buffer.
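A minimal sketch of that idea, using the standard std::mutex and std::string rather than a Win32 critical section and CString (that substitution is mine, to keep the example portable). The names appendInfo and snapshotInfo are hypothetical stand-ins for OnInfo and vbInfo.

```cpp
#include <cstddef>
#include <mutex>
#include <string>

namespace {
std::mutex infoMutex;   // guards allInfo
std::string allInfo;    // shared between the callback and the reader
}

// Stand-in for the OnInfo callback, possibly called from another thread.
void appendInfo(const std::string& message) {
    std::lock_guard<std::mutex> lock(infoMutex);
    allInfo += message;
    allInfo += '\n';
}

// Stand-in for vbInfo: trims the log to its last maxChars characters and
// returns a private copy, all while holding the lock.
std::string snapshotInfo(std::size_t maxChars) {
    std::lock_guard<std::mutex> lock(infoMutex);
    if (allInfo.size() > maxChars) {
        allInfo.erase(0, allInfo.size() - maxChars);
    }
    return allInfo;
}
```

Because the reader works on its own copy after the lock is released, the callback can reallocate the shared string at any time without the reader touching freed memory.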
You didn't say if your dll is compiled to use Unicode or ANSI strings. You didn't say if the dll that the other person supplied to you is compiled to use Unicode or ANSI strings. The VB caller probably gives you Unicode strings, but it is possible to make the VB caller give you ANSI strings. So we see your code with CString of unknown type, char* pointing to an ANSI string, BSTR pointing to a Unicode string but size allocated in bytes, and who knows what.
There are a lot of articles explaining how to use Unicode, but they'd be a bit too heavy for someone who is pretty new to C++.
It would really be best if you go back to the person who made the other dll for you, and ask that person to add features that you need. Also mention to them that you'll be calling the dll from VB, so you need their dll to handle Unicode strings.

C++ Storing objects in a file

I have a list of objects that I would like to store in a file as small as possible for later retrieval. I have been carefully reading this tutorial, and am beginning (I think) to understand, but have several questions. Here is the snippet I am working with:
static bool writeHistory(string fileName)
{
fstream historyFile;
historyFile.open(fileName.c_str(), ios::binary);
if (historyFile.good())
{
list<Referral>::iterator i;
for(i = AllReferrals.begin();
i != AllReferrals.end();
i++)
{
historyFile.write((char*)&(*i),sizeof(Referral));
}
return true;
} else return false;
}
Now, this is adapted from the snippet
file.write((char*)&object,sizeof(className));
taken from the tutorial. Now what I believe it is doing is converting the object to a pointer, taking the value and size and writing that to the file. But if it is doing this, why bother doing the conversions at all? Why not take the value from the beginning? And why does it need the size? Furthermore, from my understanding then, why does
historyFile.write((char*)i,sizeof(Referral));
not compile? i is an iterator (and isn't an iterator a pointer?). Or simply
historyFile.write(i,sizeof(Referral));
Why do I need to be messing around with addresses anyway? Aren't I storing the data in the file? If the addresses/values are persisting on their own, why can't I just store the addresses delimited in plain text and then take their values later?
And should I still be using the .txt extension? (Edit: what should I use instead then? I tried .dtb and was not able to create the file.) I actually can't even seem to get the file to open without errors with the ios::binary flag. I'm also having trouble passing the filename (as a std::string converted back by c_str(); it compiles but gives an error).
Sorry for so many little questions, but it all basically sums up to how to efficiently store objects in a file?
What you are trying to do is called serialization. Boost has a very good library for doing this.
What you are trying to do can work, in some cases, with some very important conditions. It will only work for POD types. It is only guaranteed to work for code compiled with the same version of the compiler, and with the same arguments.
(char*)&(*i)
says to take the iterator i, dereference it to get your object, take the address of it and treat it as an array of characters. This is the start of what is being written to the file. sizeof(Referral) is the number of bytes that will be written out.
And no, an iterator is not necessarily a pointer, although pointers meet all the requirements for an iterator.
Question #1 why does ... not compile?
Answer: Because i is not a Referral*; it's a list<Referral>::iterator. An iterator is an abstraction over a pointer, but it's not a pointer.
Question #2 should I still be using the .txt extension?
Answer: probably not. .txt is associated by many systems with the MIME type text/plain.
Unasked Question: does this work?
Answer: if a Referral has any pointers in it, NO. When you try to read the Referrals from the file, the pointers will be pointing to the location in memory where something used to live, but there is no guarantee that there is anything valid there anymore, least of all the thing that the pointers were pointing to originally. Be careful.
isn't an iterator a pointer?
An iterator is something that acts like a pointer from the outside. In most (perhaps all) cases, it is actually some form of object instead of a bare pointer. An iterator might contain a pointer as an internal member variable that it uses to perform its job, but it just as well might contain something else or additional variables if necessary.
Furthermore, even if an iterator has a simple pointer inside of it, it might not point directly at the object you're interested in. It might point to some kind of bookkeeping component used by the container class which it can then use to get the actual object of interest. Fortunately, we don't need to care what those internal details actually are.
So with that in mind, here's what's going on in (char*)&(*i).
*i returns a reference to the object stored in the list.
& takes the address of that object, thus yielding a pointer to the object.
(char*) casts that object pointer into a char pointer.
That snippet of code would be the short form of doing something like this:
Referral& r = *i;
Referral* pr = &r;
char* pc = (char*)pr;
Why do I need to be messing around with addresses anyway?
And why does it need the size?
fstream::write is designed to write a series of bytes to a file. It doesn't know anything about what those bytes mean. You give it an address so that it can write the bytes that exist starting wherever that address points to. You give it a size so that it knows how many bytes to write.
So if I do:
MyClass ExampleObject;
file.write((char*)&ExampleObject, sizeof(ExampleObject));
Then it writes all the bytes that exist directly within ExampleObject to the file.
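Here is a hedged sketch of the whole round trip for a type that has only fixed-size fields. The Referral below is a made-up stand-in (the real one isn't shown in the question), and the functions take generic streams so they work with a file stream or an in-memory one. For a real file I would open a std::ofstream with std::ios::binary; as far as I know, opening a plain fstream with ios::binary alone (no ios::in or ios::out) fails, which may be the open error mentioned in the question.

```cpp
#include <istream>
#include <list>
#include <ostream>
#include <sstream>
#include <type_traits>

// Hypothetical stand-in for Referral: only fixed-size fields, no pointers.
struct Referral {
    int id;
    double amount;
};
static_assert(std::is_trivially_copyable<Referral>::value,
              "raw byte copies are only valid for trivially copyable types");

// Writes each element's bytes one after another.
bool writeHistory(std::ostream& out, const std::list<Referral>& referrals) {
    for (std::list<Referral>::const_iterator i = referrals.begin();
         i != referrals.end(); ++i) {
        // *i is the Referral, &*i is its address, viewed here as raw bytes.
        out.write(reinterpret_cast<const char*>(&*i), sizeof(Referral));
    }
    return out.good();
}

// Reads fixed-size records until the stream runs out.
std::list<Referral> readHistory(std::istream& in) {
    std::list<Referral> referrals;
    Referral r;
    while (in.read(reinterpret_cast<char*>(&r), sizeof r)) {
        referrals.push_back(r);
    }
    return referrals;
}

// Convenience for trying it out without touching the file system.
std::list<Referral> roundTrip(const std::list<Referral>& referrals) {
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    writeHistory(buffer, referrals);
    return readHistory(buffer);
}
```

The resulting bytes are only meaningful to a program built with the same struct layout, which is the portability caveat raised in the other answers.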
Note: As others have mentioned, if the object you want to write has members that dynamically allocate memory or otherwise make use of pointers, then the pointed to memory will not be written by a single simple fstream::write call.
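One common way around that, sketched here under my own assumptions, is to write the pointed-to data explicitly, for example a length prefix followed by the characters. NamedReferral and the exact byte layout are hypothetical, and the format is still tied to the machine's integer size and byte order.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical record with a dynamically sized member.
struct NamedReferral {
    std::int32_t id;
    std::string name;   // the characters live on the heap, not inside the struct
};

void writeRecord(std::ostream& out, const NamedReferral& r) {
    const std::uint32_t length = static_cast<std::uint32_t>(r.name.size());
    out.write(reinterpret_cast<const char*>(&r.id), sizeof r.id);
    out.write(reinterpret_cast<const char*>(&length), sizeof length);
    out.write(r.name.data(), length);   // the pointed-to bytes themselves
}

bool readRecord(std::istream& in, NamedReferral& r) {
    std::uint32_t length = 0;
    if (!in.read(reinterpret_cast<char*>(&r.id), sizeof r.id)) return false;
    if (!in.read(reinterpret_cast<char*>(&length), sizeof length)) return false;
    r.name.resize(length);
    if (length > 0 && !in.read(&r.name[0], length)) return false;
    return true;
}

// Convenience for trying it out in memory.
NamedReferral roundTripRecord(const NamedReferral& r) {
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    writeRecord(buffer, r);
    NamedReferral copy{};
    readRecord(buffer, copy);
    return copy;
}
```

Writing sizeof(NamedReferral) bytes directly would instead store the string's internal pointer, which is meaningless once the program exits.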
will serialization give a significant boost in storage efficiency?
In theory, binary data can often be both smaller than plain-text and faster to read and write. In practice, unless you're dealing with very large amounts of data, you'll probably never notice the difference. Hard drives are large and processors are fast these days.
And efficiency isn't the only thing to consider:
Binary data is harder to examine, debug, and modify if necessary. At least without additional tools, but even then plain-text is still usually easier.
If your data files are going to persist between different versions of your program, then what happens if you need to change the layout of your objects? It can be irritating to write code so that a version 2 program can read objects in a version 1 file. Furthermore, unless you take action ahead of time (like by writing a version number in to the file) then a version 1 program reading a version 2 file is likely to have serious problems.
Will you ever need to validate the data? For instance, against corruption or against malicious changes. In a binary scheme like this, you'd need to write extra code, whereas with plain text the conversion routines can often help fill the role of validation.
Of course, a good serialization library can help out with some of these issues. And so could a good plain-text format library (for instance, a library for XML). If you're still learning, then I'd suggest trying out both ways to get a feel for how they work and what might do best for your purposes.
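If you want to try the plain-text route, a minimal sketch could look like this. The Entry type, the "version" header line and the one-record-per-line layout are all assumptions made up for the example, not a standard format.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

struct Entry {
    int id;
    double amount;
};

const int kFormatVersion = 1;   // hypothetical version tag written first

void writeText(std::ostream& out, const std::vector<Entry>& entries) {
    out << "version " << kFormatVersion << '\n';
    for (std::vector<Entry>::const_iterator e = entries.begin();
         e != entries.end(); ++e) {
        out << e->id << ' ' << e->amount << '\n';
    }
}

// Returns false if the header is missing, the version is unknown, or parsing
// stopped before the end of the input.
bool readText(std::istream& in, std::vector<Entry>& entries) {
    std::string tag;
    int version = 0;
    if (!(in >> tag >> version) || tag != "version" || version != kFormatVersion) {
        return false;
    }
    Entry e;
    while (in >> e.id >> e.amount) {
        entries.push_back(e);
    }
    return in.eof();
}

// Convenience: render the entries to a string for inspection.
std::string toText(const std::vector<Entry>& entries) {
    std::ostringstream out;
    writeText(out, entries);
    return out.str();
}
```

The stream extraction operators do part of the validation for free, and the version line gives an older program a chance to refuse a newer file instead of misreading it. Note that the default stream formatting may round doubles, so raise the precision if you need exact values back.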
What you are trying to do (reading and writing raw memory to/from file) will invoke undefined behaviour, will break for anything that isn't a plain-old-data type, and the files that are generated will be platform dependent, compiler dependent and probably even dependent on compiler settings.
C++ doesn't have any built-in way of serializing complex data. However, there are libraries that you might find useful. For example:
http://www.boost.org/doc/libs/1_40_0/libs/serialization/doc/index.html
Have you already had a look at boost::serialization? It is robust, has good documentation, and supports versioning, and if you want to switch to an XML format instead of a binary one, it'll be easier.
fstream::write simply writes raw data to a file. The first parameter is a pointer to the starting address of the data. The second parameter is the length (in bytes) of the object, so write knows how many bytes to write.
file.write((char*)&object,sizeof(className));
^
This line is converting the address of object to a char pointer.
historyFile.write((char*)i,sizeof(Referral));
^
This line is trying to convert an object (i) into a char pointer (not valid)
historyFile.write(i,sizeof(Referral));
^
This line is passing write an object, when it expects a char pointer.