I'm designing a JNI interface that passes string parameters from Java to C++. I need high performance and have been able to use a direct ByteBuffer and String.getBytes() to do that fairly well, but the penalty for passing strings to C/C++ still remains fairly high. I recently read about the OpenJDK's Unsafe class. This excellent page got me started, but I'm finding Unsafe to be woefully, if understandably, poorly documented.
I'm wondering, if I use the Unsafe class to obtain a pointer to a string and pass it to C++, is there a risk that the object has moved before the C++ code is entered? And even while C++ is executing? Or are these addresses provided by the Unsafe code somehow pinned? If they aren't pinned, how are these Unsafe pointers ever useful?
Unsafe is not meant to interop with JNI, so addresses obtained via Unsafe could change at any time (even in parallel with your C++ code).
The JNI API has the ability to pin an object in memory in order to access array content (in the HotSpot JVM this blocks GC and thus may have a negative effect on GC pause duration).
In particular, Get*ArrayElements pins the array until you explicitly call Release*ArrayElements. GetStringChars works in a similar way.
A direct ByteBuffer holds a pointer to a memory buffer outside of the heap, hence this buffer does not move and you can access it from native code.
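A minimal native-side sketch of both points, assuming a hypothetical native method native void process(String text, java.nio.ByteBuffer directBuf) on a class com.example.Native (the class and method names are illustrative only):

#include <jni.h>

extern "C" JNIEXPORT void JNICALL
Java_com_example_Native_process(JNIEnv* env, jobject, jstring text, jobject directBuf)
{
    // Acquire the string's UTF-16 characters; the JVM either pins the backing
    // storage or hands back a copy (isCopy tells you which happened).
    jboolean isCopy = JNI_FALSE;
    const jchar* chars = env->GetStringChars(text, &isCopy);
    jsize len = env->GetStringLength(text);

    // ... use chars[0..len) from native code ...

    // Always release, or the pin/copy leaks and GC may be held up.
    env->ReleaseStringChars(text, chars);

    // A direct ByteBuffer's storage lives outside the Java heap, so its address
    // is stable for the buffer's lifetime; no pinning is needed.
    void* buf = env->GetDirectBufferAddress(directBuf);
    jlong cap = env->GetDirectBufferCapacity(directBuf);
    (void)buf; (void)cap; (void)len; (void)isCopy;
}

Note that the JVM is free to return a copy from GetStringChars rather than a pinned pointer; either way the pointer stays valid until the matching Release call.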
I've read the Java source for sun.misc.Unsafe and have a bit more insight.
Unsafe has at least two ways of dealing with memory.
allocateMemory/reallocateMemory/freeMemory/etc -- As far as I can tell, this memory is allocated outside the heap, so it faces no GC'ing challenges. I have indirectly tested this and it seems that the long returned is simply a pointer to the memory. It seems very likely that this type of memory is safe to pass through JNI to native code, and the application Java code should be able to quickly modify/query it before and after JNI calls by using some of the other intrinsic Unsafe methods that support this style of memory pointer (a native-side sketch follows this answer).
object+offset - These methods accept a pointer to an object and an "offset" token to indicate where in the object to fetch/modify the value. The objects presumably are always in the Java heap, but passing the object to these methods probably helps resolve GC complications. It does sound like the "offset" is sometimes a "cookie" rather than an actual offset, but it also sounds like, in the case of arrays, arrayBaseOffset() returns an "offset" that one can manipulate arithmetically. I don't know if this object+offset is safe for JNI code. I don't see a method to generate a pointer directly to the Java object in the heap that one could (dangerously) pass through JNI. One could pass an object and offset, but given the cost of passing Objects through JNI, this approach is not appealing anyway.
Like (1), the code associated with the page I referenced in my posting is probably pretty safe for JNI interactions. It takes the object+offset approach when dealing with String, but uses approach (1) when dealing with the direct ByteBuffer, which always resides outside the Java heap. Direct ByteBuffers are very JNI-friendly, and often they can be used in ways that avoid the JNI Object-passing costs I allude to in my comment to Tom above.
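To illustrate approach (1) from the native side, a minimal sketch, assuming the Java code did something like long addr = unsafe.allocateMemory(n) and declared a hypothetical native method native void consume(long addr, long length) on com.example.Native:

#include <jni.h>
#include <cstdint>

extern "C" JNIEXPORT void JNICALL
Java_com_example_Native_consume(JNIEnv*, jobject, jlong addr, jlong length)
{
    // The long really is a raw pointer into off-heap memory, so the GC never
    // moves it; it stays valid until the Java side calls Unsafe.freeMemory().
    auto* bytes = reinterpret_cast<std::uint8_t*>(static_cast<std::uintptr_t>(addr));
    for (jlong i = 0; i < length; ++i) {
        bytes[i] ^= 0xFF;   // placeholder work on the shared buffer
    }
}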
This is something that recently crossed my mind. Quoting from Wikipedia: "To initialize a function pointer, you must give it the address of a function in your program."
So, I can't make it point to an arbitrary memory address, but what if I overwrite the memory at the address of the function with a piece of data the same size as before and then invoke it via the pointer? If that data corresponds to an actual function and the two functions have matching signatures, the latter should be invoked instead of the first.
Is it theoretically possible ?
I apologize if this is impossible due to some very obvious reason that I should be aware of.
If you're writing something like a JIT, which generates native code on the fly, then yes you could do all of those things.
However, in order to generate native code you obviously need to know some implementation details of the system you're on, including how its function pointers work and what special measures need to be taken for executable code. For one example, on some systems after modifying memory containing code you need to flush the instruction cache before you can safely execute the new code. You can't do any of this portably using standard C or C++.
You might find when you come to overwrite the function, that you can only do it for functions that your program generated at runtime. Functions that are part of the running executable are liable to be marked write-protected by the OS.
The issue you may run into is Data Execution Prevention. It tries to keep you from executing data as code or allowing code to be written to like data. You can turn it off on Windows. Some compilers/OSes may also place code into const-like sections of memory that the OS/hardware protect. The standard says nothing about what should or should not work when you write an array of bytes to a memory location and then call a function that involves jumping to that location. It's all dependent on your hardware and your OS.
While the standard does not provide any guarantees as to what would happen if you make a function pointer that does not refer to a function, in real life, on your particular implementation, and knowing the platform, you may be able to do that with raw data.
I have seen example programs that created a char array with the appropriate binary code and had it execute by doing careful casting of pointers. So in practice, and in a non-portable way, you can achieve that behavior.
It is possible, with caveats given in other answers. You definitely do not want to overwrite memory at some existing function's address with custom code, though. Not only is executable memory typically not writable, but you have no guarantees as to how the compiler might have used that code. For all you know, the code may be shared by many functions that you think you're not modifying.
So, what you need to do is (a minimal sketch follows this list):
Allocate one or more memory pages from the system.
Write your custom machine code into them.
Mark the pages as non-writable and executable.
Run the code; there are two ways of doing it:
Cast the address of the pages you got in #1 to a function pointer, and call the pointer.
Execute the code in another thread, by passing the pointer to the code directly to a system API or framework function that starts the thread.
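Putting those steps together, here is a minimal sketch for a POSIX/x86-64 system; the byte sequence and the mmap()/mprotect() calls are platform-specific assumptions (Windows would use VirtualAlloc()/VirtualProtect(), and some targets also need the instruction-cache flush mentioned above):

#include <sys/mman.h>
#include <cstring>
#include <cstdio>

int main() {
    // Step 1: get a fresh writable page from the OS.
    const size_t size = 4096;
    void* page = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) return 1;

    // Step 2: write machine code into it. These bytes are x86-64 for
    // `mov eax, 42; ret`, i.e. a function that returns 42.
    const unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };
    std::memcpy(page, code, sizeof code);

    // Step 3: flip the page to read+execute, and no longer writable.
    if (mprotect(page, size, PROT_READ | PROT_EXEC) != 0) return 1;

    // Step 4a: cast the page address to a function pointer and call it.
    auto fn = reinterpret_cast<int (*)()>(page);
    std::printf("generated code returned %d\n", fn());

    munmap(page, size);
    return 0;
}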
Your question is confusingly worded.
You can reassign function pointers, and you can assign them to null; the same goes for member pointers. Unless you declare them const, you can reassign them, and yes, the new function will be called instead. The signatures must match exactly. Consider std::function instead.
You cannot "overwrite the memory at the address of a function". You probably can indeed do it some way, but just don't: you'd be writing into your program's code and are likely to screw it up badly.
This is a terrible idea, but I'm seeing if it's even feasible before I walk down this road.
I have to write a Win32 C++ program that can dynamically load a library based on a file that has serialized information on what DLL, function, signature, and arguments to use. Loading the library is trivial (LoadLibraryEx works fine). Then getting the function pointer is easy (not a big deal, GetProcAddress takes care of this). However, the rest is tricky.
Here's my plan of attack, feel free to let me know if this isn't the best approach:
Open the serialized information from a file on what DLL to load, and what function to execute.
LoadLibraryEx to bring in the DLL
GetProcAddress to get the function pointer (after casting the byte array to a string)
Write the arguments (which are read in as a byte array) to memory in bytes.
Get the address of the beginning of each argument (I'll know from serialization what the size of each argument is).
Using assembly, push the addresses of the arguments (which live on the heap) onto the stack in reverse order, then jump to the beginning of the function.
Execute and get back the return value address (as a void * ?)
Use the memory address of the return value (that I got from assembly) and the size (which I got from the serialization) of the return type value and write the raw bytes back to a file.
Keep in mind my limitations:
I will never know until runtime what the signature, DLL, and function name are.
It is always read in from a file.
Is there a better approach, will this approach even work?
Update
For anyone who comes poking into this thread to learn more, I found a solution. In C you can dynamically load a library using dlopen (there's a winlib version of this for ease of use). Once loaded, you can dynamically execute functions using libffi (it supports Mac/iOS/Windows, 64-bit and 32-bit). This only gets you to C functions and primitive types (pointer, uint, int, double, float), and that's about it. However, using the Mac OS X Objective-C bridge you can access Objective-C by loading libobjc (OS X's native Obj-C to C "toll-free" bridge), and then through that dynamically create Obj-C and C++ classes. A similar technique can be done on Windows using C# and its marshaling capabilities.
This ends up with HIGH overhead, and you must be VERY careful about your memory; in addition, don't mix pointers from C/C#/C++. Finally, whatever you do at runtime, BE ABSOLUTELY SURE YOU KNOW YOUR TYPES... seriously. BTW, libffi/cinvoke are amazing libraries.
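For anyone wanting a concrete starting point, here is a minimal sketch of the libffi route on Windows, assuming a hypothetical mathlib.dll that exports int add(int, int) with the default calling convention (the DLL and function names are illustrative only):

#include <windows.h>
#include <ffi.h>
#include <cstdio>

int main() {
    // Load the library and resolve the symbol (names are hypothetical).
    HMODULE lib = LoadLibraryA("mathlib.dll");
    if (!lib) return 1;
    void (*fn)() = reinterpret_cast<void (*)()>(GetProcAddress(lib, "add"));
    if (!fn) return 1;

    // Describe the signature at runtime: int add(int, int).
    ffi_cif cif;
    ffi_type* arg_types[2] = { &ffi_type_sint, &ffi_type_sint };
    if (ffi_prep_cif(&cif, FFI_DEFAULT_ABI, 2, &ffi_type_sint, arg_types) != FFI_OK)
        return 1;

    // libffi takes the arguments as an array of pointers to their storage.
    int a = 2, b = 3;
    void* arg_values[2] = { &a, &b };

    // Make the call and collect the return value.
    ffi_arg result;   // wide enough for any integral return type
    ffi_call(&cif, FFI_FN(fn), &result, arg_values);
    std::printf("add(2, 3) = %d\n", static_cast<int>(result));

    FreeLibrary(lib);
    return 0;
}

In a real version of this program, the ffi_type array and argument buffers would be built from the serialized signature instead of being hard-coded.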
There are existing libraries that can do what you describe, such as C/Invoke:
http://www.nongnu.org/cinvoke/
General rule: if you have a terrible idea, drop it and find a good one.
If the signature is not known at all, what you describe will fall on its face. Suppose your call works for my function as it is; I then change the function from __stdcall to __cdecl or back, and you will crash.
Also, you don't handle the return value.
If you relax the "unknown" to allow some limitations, like fixing a return type and calling convention, you are somewhat ahead -- then you really can emulate the call, but what is it good for? This whole thing sounds like a self-service hack-me engine.
The normal way is to publish some fixed interface (using function signatures), and make the DLL support it. Or arrange some uniform data transfer mechanism.
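As a sketch of that fixed-interface idea (the export name and signature below are just an invented convention, not anything standard):

#include <cstddef>

// Each plug-in DLL exports this single, known entry point. The host only ever
// calls this one signature, so the calling convention and types are fixed and
// the per-DLL behaviour is selected by the command string and argument blob.
extern "C" __declspec(dllexport)
int PluginInvoke(const char* command,
                 const void* args, std::size_t args_size,
                 void* result, std::size_t result_size);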
I'd suggest you describe what you're really after and ask that, maybe on Programmers SE.
If I allocate some memory using malloc(), is there a way to mark it read-only, so that memcpy() fails if someone attempts to write to it?
This is connected to a faulty API design where users are misusing a const pointer returned by a method GetValue(), which is part of a large memory structure. Since we want to avoid copying a large chunk of memory, we return a live pointer into structured memory which is in a specific format we have developed. Now the problem is that some users have found a hack to get their stuff working by writing to this memory directly, avoiding the SetValue() call that does the allocation and properly handles the in-memory binary format. Although their hack sometimes works, it sometimes causes memory access violations due to incorrect interpretation of control flags which have been overwritten by the user.
Educating users is one task, but let's say for now we want their code to fail.
I am just wondering if we can simply protect against this case.
For an analogy, assume someone gets a blob column from an sqlite statement and then writes back to it. In the case of sqlite that would not make sense, but something similar is happening in our case.
On most hardware architectures you can only change protection attributes on entire memory pages; you can't mark a fragment of a page read-only.
The relevant APIs are:
mprotect() on Unix;
VirtualProtect() on Windows.
You'll need to ensure that the memory page doesn't contain anything that you don't want to make read-only. To do this, you'll either have to overallocate with malloc(), or use a different allocation API, such as mmap(), posix_memalign() or VirtualAlloc().
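A minimal Linux-flavoured sketch of that advice, using mmap() so the allocation is page-aligned and mprotect() is well-defined for it (on Windows the analogous calls are VirtualAlloc() and VirtualProtect()):

#include <sys/mman.h>
#include <unistd.h>
#include <cstring>
#include <cstdio>

int main() {
    // Allocate a whole anonymous page instead of using malloc(), so the
    // protection change covers exactly our data and nothing else.
    const size_t size = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    void* buf = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) return 1;

    std::strcpy(static_cast<char*>(buf), "read-only payload");

    // Make the page read-only; any later write (memcpy included) now faults.
    if (mprotect(buf, size, PROT_READ) != 0) { std::perror("mprotect"); return 1; }

    std::printf("%s\n", static_cast<const char*>(buf));

    munmap(buf, size);
    return 0;
}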
Depends on the platform. On Linux, you could use mprotect() (http://linux.die.net/man/2/mprotect).
On Windows you might try VirtualProtect() (http://msdn.microsoft.com/en-us/library/windows/desktop/aa366898(v=vs.85).aspx). I've never used it though.
Edit:
This is not a duplicate of NPE's answer. NPE originally had a different answer; it was edited later and mprotect() and VirtualProtect() were added.
a faulty API design where users are misusing a const pointer returned by a method GetValue() which is part of a large memory structure. Since we want to avoid copying a large chunk of memory we return a live pointer into structured memory which is in a specific format
That is not clearly a faulty API design. An API is a contract: you promise your class will behave in a particular way, clients of the class promise to use the API in the proper manner. Dirty tricks like const_cast are improper (and in some, but not all cases, have undefined behaviour).
It would be faulty API design if using const_cast led to a security issue. In that case you must copy the chunk of memory, or redesign the API. This is the norm in Java, which does not have an equivalent of const (despite const being a reserved word in Java).
Obfuscate the pointer, i.e. return to the client the pointer plus an offset; now they can't use the pointer directly.
Whenever the pointer is passed back to your code via the official API, subtract the offset and use the pointer as usual.
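A tiny sketch of that offset trick; the offset value and function names are made up for illustration:

#include <cstdint>

namespace {
// Arbitrary illustrative offset; a real implementation might derive it per process.
constexpr std::uintptr_t kOffset = 0x5A5A5A5A;
}

// Hand out an obfuscated handle instead of the raw const pointer.
const void* obfuscate(const void* p) {
    return reinterpret_cast<const void*>(reinterpret_cast<std::uintptr_t>(p) + kOffset);
}

// Recover the real pointer inside the official API before using it.
const void* deobfuscate(const void* handle) {
    return reinterpret_cast<const void*>(reinterpret_cast<std::uintptr_t>(handle) - kOffset);
}

This only raises the bar, of course; a determined user can still recover the real address, but the casual write-through-the-pointer hack stops working.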
I'm porting a lot of math I'm using over to C++ from Java and seeing a great performance boost from doing so, but I can't figure out which JNI function to use in order to get rid of variables that I don't need anymore. For instance, I know that when your JNI method comes to its end and you've been using a jfloatArray, you call:
env->ReleaseFloatArrayElements(vec,in,0);
And that would destroy the array and free up memory. I'd like to be able to do the same with single primitives that aren't array types if possible, but I've looked through the Oracle and Sun docs and there are no methods to do such a thing... Should I just use the default way to destroy objects using C++, or is there a safe, sure-fire way to do such a thing?
There's nothing you need to do. You only have to clean up in cases where the JNI interface may have allocated memory or other resources. Basic types, like jfloat, are typedefs for basic C++ types (usually float), and are passed around by copy; when you declare a jfloat, it's just a floating-point value on the stack, and it disappears when you leave its scope. The types you have to clean up will normally be pointers; the clean-up functions are there to free up the memory the pointer points to.
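To make the contrast concrete, a minimal JNI sketch, assuming a hypothetical native method native void scale(float[] vec, float factor) on a class com.example.MathKernel (names are illustrative only):

#include <jni.h>

extern "C" JNIEXPORT void JNICALL
Java_com_example_MathKernel_scale(JNIEnv* env, jobject, jfloatArray vec, jfloat factor) {
    // `factor` is just a C++ float copy: nothing to release, it vanishes with the stack frame.
    jsize len = env->GetArrayLength(vec);

    // The array contents must be acquired and later released explicitly.
    jfloat* in = env->GetFloatArrayElements(vec, nullptr);
    if (in == nullptr) return;   // an OutOfMemoryError has already been thrown

    for (jsize i = 0; i < len; ++i)
        in[i] *= factor;

    // 0 = copy any changes back (if a copy was made) and free the native buffer.
    env->ReleaseFloatArrayElements(vec, in, 0);
}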
I have an unmanaged C++ library. I would like to expose the functionality to .NET applications. There's one particular function I am not sure how to handle:
typedef void (*free_fn) (void*);
void put (void *data, free_fn deallocation_function);
The idea is that you pass dynamically allocated buffer to the function and supply a deallocation function. The library will process the data asynchronously and will release the buffer later on when data is no longer needed:
void *p = malloc (100);
... fill in the buffer...
put (p, free);
How can I expose this kind of thing to .NET applications?
Be very careful when you do this. .NET really, really wants to have its objects pinned on the way into an unmanaged routine and unpinned on the way out. If your unmanaged code holds onto a pointer value that had been pinned on the way in, then there is a very real chance that the memory will be moved or garbage collected, or both.
This is especially the case with delegates marshalled to function pointers (trust me on this - I found that marshaled delegates were being garbage collected on me - I had people at Microsoft verify that for me). The ultimate solution to this problem is to stash away copies of your delegates in a static table paired with a unique transaction id, then create an unmanaged function that when called looks up the delegate in the table via transaction id then executes it. It's ugly and if I had another choice, I would've used it.
Here's the best way to do this in your case: since your unmanaged code uses a set-it-and-forget-it model, you should make your API chunkier. Create a wrapper in managed C++ that allocates memory via an unmanaged routine, copies your data into it, and then passes it on along with a pointer to an unmanaged deallocator.
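A minimal sketch of that chunkier wrapper, written in C++/CLI syntax (the class name is made up; put and free_fn are the native declarations from the question):

#include <cstdlib>
#include <cstring>

// Native declarations from the library being wrapped.
typedef void (*free_fn)(void*);
void put(void* data, free_fn deallocation_function);

public ref class NativeQueue
{
public:
    static void Put(array<System::Byte>^ data)
    {
        if (data == nullptr || data->Length == 0) return;

        // Copy the managed bytes into an unmanaged buffer that the library can own.
        void* buffer = std::malloc(data->Length);
        pin_ptr<System::Byte> src = &data[0];   // pinned only for the duration of the memcpy
        std::memcpy(buffer, src, data->Length);

        // Hand the buffer and the matching unmanaged deallocator to the library;
        // no managed object needs to stay pinned while the library works asynchronously.
        put(buffer, std::free);
    }
};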
In general, .NET consumers of your library won't be passing dynamically created arrays to your functions. As far as I know, all containers in .NET are garbage collected.
Regardless, you will need to make a managed wrapper for your unmanaged code. There are many tutorials and articles on this; here is one to start with.
When writing .NET wrappers for unmanaged code, I've found that you want to concentrate more on preserving functionality than on making every function accessible in .NET. In your example, it may be better to just have the managed wrapper copy the array into unmanaged memory and perform whatever operations you need to inside the library. This way you don't have to do any pinning of managed memory or marshalling of managed to unmanaged memory in order to circumvent the .NET runtime's garbage collection. However, how you implement the managed wrapper really depends on what the purpose of that function is.
If you really want to implement this function for function in .NET, you will need to look at the Marshal class in .NET for taking control of managed memory in unmanaged code.
For your callback function, you will first need to create .NET delegates that can be assigned in managed code. You will then need to make an unmanaged free function internal to your library that is called by the unmanaged version of the put function. This unmanaged free function will then be responsible for calling the managed delegate, if the user assigned one.
You definitely don't want to pin the managed buffer, as trying to deallocate it in unmanaged code seems like the shortest route to madness. If you can't rewrite this portion in fully managed code, your best bet is either going to be making a copy of the data in the wrapper, or completely hiding the buffer management from the managed world.
If you had the guts (and the masochistic stamina) you could pin the buffer in the wrapper, then pass in the marshaled delegate of a managed function that unpins the buffer. However, I wouldn't suggest it. Having had to do a couple of managed wrappers has taught me the value of exposing the absolute minimum unmanaged functionality, even if it means you have to rewrite some things in managed code. Crossing that boundary is about as easy as going from East Germany to West Germany used to be, to say nothing of the performance hits.
Most replies suggest that the data should be copied from managed buffer to unmanaged buffer. How exactly would you do that? Is following implementation OK?
void managed_put (byte data_ __gc[], size_t size_)
{
    // Pin the data
    byte __pin *tmp_data = &data_[0];

    // Copy data to the unmanaged buffer.
    void *data = malloc (size_);
    memcpy (data, (byte*) tmp_data, size_);

    // Forward the call
    put (data, free);
}
Some of the previous posters have been using MC++, which is deprecated. C++/CLI is a far more elegant solution.
The BEST technique for interop is implicit interop, not explicit. I don't believe anybody has commented on this yet. It gives you the ability to marshal your types between managed and native code such that, if you make a change to your type definition or structure layout, it will not result in a breaking change (which explicit interop does).
This Wikipedia article documents some of the differences and is a good starting point for further information.
P/Invoke (explicit and implicit)
Also, the site marshal-as.net has some examples and information on this newer method (again, more ideal, as it will not break your code if the native struct is re-defined).
You'd have to have managed wrappers for the functions themselves (or unmanaged wrappers if you want to pass in managed functions). Or else, treat the unmanaged function pointers as opaque handles in the managed world.
Since you mentioned it was asynchronous, I'd do it this way.
The .NET-exposed function only takes the data and doesn't take a delegate. Your code passes the pinned data and a function pointer to a function that will simply unpin the data. This leaves the memory cleanup to the GC, but makes sure it won't clean it up till the asynchronous part is done.