Can a stack allocated Rust buffer be accessed through C++? - c++

In order to avoid heap allocations, and since I know the maximum MTU of an ethernet packet, I created a small buffer, [u8; MAX_BYTES_TRANSPORT], in Rust, that C++ should fill for me:
pub fn receive(&mut self, f: &dyn Fn(&[u8])) -> std::result::Result<(), &str> {
let mut buffer: [u8; MAX_BYTES_TRANSPORT] = [0; MAX_BYTES_TRANSPORT];
//number of bytes written in buffer in the C++ side
let written_size: *mut size_t = std::ptr::null_mut::<size_t>();
let r = unsafe{openvpn_client_receive_just(
buffer.as_mut_ptr(), buffer.len(), written_size, self.openvpn_client)};
So, the function openvpn_client_receive_just, which is a C++ function with C interface, should write to this buffer. Is this safe? I couldn't find information about a stack allocated Rust buffer being used in C++
This is the function:
uint8_t openvpn_client_receive_just(
uint8_t *buffer, size_t buffer_size, size_t *written_size, OpenVPNSocket *client)

Can a stack allocated buffer be accessed through C++?
Yes.
From the type-system perspective there is no difference between statically allocated, stack allocated, or heap allocated: the C signature only takes pointer and size, and cares little where that pointer points to.
Is this safe?
Most likely.
As long as the C function is correctly written and respects the bounds of the buffer, this will be safe. If it doesn't, well, that's a bug.
One could argue that it's better to have a heap-allocated buffer, but honestly once one starts writing out of bounds, overwriting arbitrary stack bytes or overwriting arbitrary heap bytes are both bad, and have undefined behavior.
For extra security, you could use a heap-allocated buffer nested between two guard pages. Using OS-specific facilities, you could allocate 3 contiguous OS pages (typically 4KB each on x86), then mark the first and last as read-only and put your buffer in the middle one. Any (close) write before or after the buffer would then be caught by the OS. Larger jumps, though, wouldn't be, so it's a lot of effort for a mitigation.
Is your code safe?
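You most likely need to know how many bytes were written, so passing a null pointer for written_size is strange: the C++ side has nowhere to store the count, and writing through a null pointer is undefined behavior.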
I'd expect to see:
let mut written: size_t = 0;
let written_size = &mut written as *mut _;
And yes, that's once again a pointer to a stack variable, just like you would in C.
A note on style: your Rust code is unusual in that you use fully typed variables and full paths. A more idiomatic style would be:
// Result is implicitly in scope.
pub fn receive(&mut self, f: &dyn Fn(&[u8])) -> Result<(), &str> {
let mut buffer = [0u8; MAX_BYTES_TRANSPORT];
let mut written: size_t = 0;
let written_size = &mut written as *mut _;
// Safety:
// <enumerate preconditions to safely call the function here, and why they are met>
let result = unsafe {
openvpn_client_receive_just(
buffer.as_mut_ptr(), buffer.len(), written_size, self.openvpn_client)
};
translate_openvpn_error(result)?;
let buffer = &buffer[0..written];
f(buffer);
Ok(())
}
I did annotate the type for written, to help along inference, but strictly speaking it should not be necessary.
Also, I like to preface every unsafe call I make with the list of pre-conditions that make it safe, and for each why they are met. It helps me audit my unsafe code later on.
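To make the points above concrete, here is a minimal sketch that compiles on its own. The C++ side is replaced by a stand-in written in Rust, so `fake_receive`, its "0 means success" convention, and the value 1500 for `MAX_BYTES_TRANSPORT` are all assumptions for illustration, not part of the OpenVPN wrapper from the question.

```rust
// Hypothetical value; the question does not state the real constant.
const MAX_BYTES_TRANSPORT: usize = 1500;

// Stand-in for the C++ function (assumption: returns 0 on success).
// It copies a fixed payload into `buffer` and reports its length.
unsafe extern "C" fn fake_receive(
    buffer: *mut u8,
    buffer_size: usize,
    written_size: *mut usize,
) -> u8 {
    let payload = b"hello";
    if buffer.is_null() || written_size.is_null() || buffer_size < payload.len() {
        return 1;
    }
    unsafe {
        std::ptr::copy_nonoverlapping(payload.as_ptr(), buffer, payload.len());
        *written_size = payload.len();
    }
    0
}

pub fn receive(f: &dyn Fn(&[u8])) -> Result<(), &'static str> {
    // Both the buffer and the counter live on the stack.
    let mut buffer = [0u8; MAX_BYTES_TRANSPORT];
    let mut written: usize = 0;

    // Safety:
    // - `buffer` is a live, writable array of exactly `buffer.len()` bytes.
    // - `&mut written` is a valid, aligned pointer to a `usize`.
    // - The callee does not keep either pointer after it returns.
    let status = unsafe { fake_receive(buffer.as_mut_ptr(), buffer.len(), &mut written) };
    if status != 0 {
        return Err("receive failed");
    }

    // Do not trust the reported length blindly: `get` returns None
    // instead of panicking if it exceeds the buffer.
    let data = buffer.get(..written).ok_or("callee reported too many bytes")?;
    f(data);
    Ok(())
}
```

Using `buffer.get(..written)` rather than indexing is a defensive choice: a misbehaving callee that reports a length larger than the buffer produces an error instead of a panic.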

Related

Memory allocate in c++

I have a project in which I have to allocate 1024 bytes when my program starts. In C++ program.
void* available = new char*[1024];
I write this and I think it is okay.
Now my problem starts, I should make a function that receives size_t size (number of bytes) which I should allocate. My allocate should return a void* pointer to the first bytes of this available memory. So my question is how to allocate void* pointer with size and to get memory from my available.
I'm a student and I'm not a professional in C++.
Also sorry for my bad explanation.
It looks like you're trying to make a memory pool. Even though that's a big topic, let's check what the minimal effort is to create something like this.
There are some basic elements to a pool that one needs to grasp. Firstly the memory itself, i.e. where do you draw memory from. In your case you already decided that you're going to dynamically allocate a fixed amount of memory. To do it properly the code should be:
char *poolMemory = new char[1024];
I didn't choose void* pool here because delete[] pool is undefined when pool is a void pointer. You could go with malloc/free but I'll keep it C++. Secondly I didn't allocate an array of pointers as your code shows because that allocates 1024 * sizeof(char*) bytes of memory.
A second consideration is how to give back the memory you acquired for your pool. In your case you want to remember to delete it so best you put it in a class to do the RAII for you:
class Pool
{
char *_memory;
void *_pool;
size_t _size;
public:
Pool(size_t poolSize = 1024)
: _memory(new char[poolSize])
, _pool(_memory)
, _size(poolSize)
{
}
~Pool() { delete[] _memory; } // Forgetting this will leak memory.
};
Now we come to the part you're asking about. You want to use memory inside that pool. Make a method in the Pool class called allocate that will give back n number of bytes. This method should know how many bytes are left in the pool (member _size) and essentially performs pointer arithmetic to let you know which location is free. There is a catch, unfortunately: you must provide the required alignment that the resulting memory should have. This is another big topic that, judging from the question, I don't think you intend to handle (so I'm defaulting alignment to 2^0=1 bytes).
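#include <memory> // std::align
// Requires a matching declaration inside class Pool:
//   void* allocate(size_t nBytes, size_t alignment = 1);
void* Pool::allocate(size_t nBytes, size_t alignment)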
{
if (std::align(alignment, nBytes, _pool, _size))
{
void *result = _pool;
// Bookkeeping
_pool = (char*)_pool + nBytes; // Advance the pointer to available memory.
_size -= nBytes; // Update the available space.
return result;
}
return nullptr;
}
I did this pointer arithmetic using std::align but I guess you could do it by hand. In a real world scenario you'd also want a deallocate function that "opens up" spots inside the pool after they have been used. You'd also want some strategy for when the pool has run out of memory, such as a fallback allocation. Additionally, the initial memory acquisition can be more efficient, e.g. by using static memory where appropriate. There are many flavors and aspects to this; I hope the initial link I included gives you some motivation to research a bit on the topic.

(Rust question) C++ double pointer to void meaning

I'm trying to work with active directory from Rust by following the c++ examples Microsoft posts for the ADSI API and the Windows-RS crate. I'm not understanding quite what is going on here:
https://learn.microsoft.com/en-us/windows/win32/api/adshlp/nf-adshlp-adsopenobject
They create an uninitialized pointer to IADs (drawing from my c# knowledge, it looks like an interface) then, when it comes time to use it, they have a double pointer that is cast as void. I tried to replicate this behavior in Rust, but I'm thinking I'm just not understanding exactly what is happening. This is what I've tried so far:
// bindings omitted
use windows::Interface;
use libc::c_void;
fn main() -> windows::Result<()> {
let mut pads: *mut IADs = ptr::null_mut();
let ppads: *mut *mut c_void = pads as _;
unsafe {
let _ = CoInitialize(ptr::null_mut());
let mut ldap_root: Vec<u16> = "LDAP://rootDSE\0".encode_utf16().collect();
let hr = ADsOpenObject(
ldap_root.as_mut_ptr() as _,
ptr::null_mut(),
ptr::null_mut(),
ADS_AUTHENTICATION_ENUM::ADS_SECURE_AUTHENTICATION.0 as _,
& IADs::IID,
ppads,
);
if !hr.is_err() {
...
}
}
Ok(())
}
First, I'm probably wrong to be creating a null pointer because that's not what they're doing in the example, but the problem is that rust doesn't permit the use of an uninitialized variable, so I'm not sure what the equivalent is.
Second, presumably the pADs variable is where the output is supposed to go, but I'm not understanding the interaction of having a pointer, then a double pointer, to an object that doesn't have an owner. Even if that were possible in rust, I get the feeling that it's not what I'm supposed to do.
Third, once I have the pointer updated by the FFI call, how do I tell Rust what the resulting output type is so that we can do more work with it? Doing as _ won't work because it's a struct, and I have a feeling that using transmute is bad
Pointer parameters are often used in FFIs as a way to return data alongside the return value itself. The idea is that the pointer should point to some existing object that the call will populate with the result. Since the Windows API functions often return HRESULTs to indicate success and failure, they use pointers to return other stuff.
In this case, ADsOpenObject wants to return a void* (the requested ADs interface object), so you need to give it a pointer to an existing void* object for it to fill:
let mut pads: *mut c_void = std::ptr::null_mut();
let ppads = &mut pads as *mut *mut c_void;
// or inferred inline
let hr = ADsOpenObject(
...
&mut pads as _,
);
I changed pads to *mut c_void to simplify this demonstration and match the ADsOpenObject parameters. After a successful call, you can cast pads to whatever you need.
The key difference is casting pads vs &mut pads. What you were doing before was making ppads the same value as pads, and thus telling the function that the void* result should be written at address null. No good. Taking &mut pads makes the parameter point to pads instead.
And the uninitialized vs null difference is fairly moot because the goal of the function is to overwrite it anyways.
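The out-parameter pattern can be tried without the Windows bindings. The following is a minimal sketch in which `open_object` is a hypothetical stand-in for an API shaped like ADsOpenObject (status code as the return value, object handed back through a `void**`); it is not the real function, and the `u32` payload is only there to have something to read back.

```rust
use std::ffi::c_void;

// Hypothetical stand-in: returns 0 on success and writes the new object's
// address through `out`. Here the "object" is simply a boxed u32.
unsafe extern "C" fn open_object(out: *mut *mut c_void) -> i32 {
    if out.is_null() {
        return -1;
    }
    let object = Box::new(42u32);
    unsafe {
        *out = Box::into_raw(object) as *mut c_void;
    }
    0
}

pub fn open_and_read() -> Option<u32> {
    // The slot the callee will fill; null is just a placeholder value.
    let mut raw: *mut c_void = std::ptr::null_mut();

    // Pass the address of `raw`, not its current (null) value.
    let status = unsafe { open_object(&mut raw) };
    if status != 0 || raw.is_null() {
        return None;
    }

    // The stand-in is known to hand back a Box<u32>, so it is sound to
    // reclaim ownership here; a real COM interface would be released instead.
    let object = unsafe { Box::from_raw(raw as *mut u32) };
    Some(*object)
}

// Shows why `pads as _` fails: casting the value keeps it null.
pub fn wrong_cast_is_null() -> bool {
    let raw: *mut c_void = std::ptr::null_mut();
    (raw as *mut *mut c_void).is_null()
}
```

`wrong_cast_is_null` returns true, which is exactly the situation in the question: the function would be asked to write its result at address null.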

How to handle char * from packed struct in cgo?

Since Go doesn't support packed structs, I found this great article that explains, with examples, how to work with packed structs in Go: https://medium.com/#liamkelly17/working-with-packed-c-structs-in-cgo-224a0a3b708b
The problem is when I try char * in place of [10]char it's not working. I'm not sure how this conversion works with [10]char and not with char * . Here is example code taken from above article and modified with char * .
package main
/*
#include "stdio.h"
#pragma pack(1)
typedef struct{
unsigned char a;
char b;
int c;
unsigned int d;
char *e; // changed from char[10] to char *
}packed;
void PrintPacked(packed p){
printf("\nFrom C\na:%d\nb:%d\nc:%d\nd:%d\ne:%s\n", p.a, p.b, p.c, p.d, p.e);
}
*/
import "C"
import (
"bytes"
"encoding/binary"
)
//GoPack is the go version of the c packed structure
type GoPack struct {
a uint8
b int8
c int32
d uint32
e [10]uint8
}
//Pack Produces a packed version of the go struct
func (g *GoPack) Pack(out *C.packed) {
buf := &bytes.Buffer{}
binary.Write(buf, binary.LittleEndian, g)
*out = *(*C.packed)(C.CBytes(buf.Bytes()))
}
func main() {
pack := &GoPack{1, 2, 3, 4, [10]byte{}}
copy(pack.e[:], "TEST123")
cpack := C.packed{} //just to allocate the memory, still under GC control
pack.Pack(&cpack)
C.PrintPacked(cpack)
}
I'm working with cgo first time so correct me if i am wrong at any point.
You are writing ten (zero) bytes of GoPack.e into the packed.e which is of type char *. This won't work, because pointers will be 4 or 8 bytes depending on your system, so even if the bytes represented a valid pointer, you are overflowing the amount of memory allocated.
If you want to create a valid structure with a valid packed.e field, you need to allocate 10 bytes of memory in the C heap, copy the bytes into that, and then point packed.e to this allocated memory. (You will also need to free this memory when you free the corresponding packed structure). You can't do this directly with binary.Write.
You can take this as a starting point:
buf := &bytes.Buffer{}
binary.Write(buf, binary.LittleEndian, g.a)
binary.Write(buf, binary.LittleEndian, g.b)
binary.Write(buf, binary.LittleEndian, g.c)
binary.Write(buf, binary.LittleEndian, g.d)
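// encoding/binary only accepts fixed-size types, so widen the pointer (assumes 64-bit pointers).
binary.Write(buf, binary.LittleEndian, uint64(uintptr(C.CBytes(g.e[:]))))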
*out = *(*C.packed)(C.CBytes(buf.Bytes()))
The function C.CBytes(b) allocates len(b) bytes in the C heap, and copies the bytes from b into it, returning an unsafe.Pointer.
Note that I've copied your *out = *(*C.packed)... line from your code. This actually causes a memory leak and an unnecessary copy. Probably it would be better to use a writer that writes bytes directly to the memory pointed to by out.
Perhaps this?
const N = 10000 // should be sizeof(*out) or larger
// Length 0 with capacity N, so that writes start at the first byte of *out.
buf := bytes.NewBuffer((*[N]byte)(unsafe.Pointer(out))[:0])
This makes a bytes.Buffer that directly writes to the out struct without going through any intermediate memory. Note that because of unsafe shenanigans, this is vulnerable to a buffer overflow if you write more bytes of data than is pointed to by out.
Words of warning: this is all pretty nasty, and prone to the same sorts of problems you'd find in C, and you'd need to check the cgo pointer rules to make sure that you're not vulnerable to garbage collection interactions. A point of advice: given that you say you "don't have much experience with pointers and memory allocation", you probably should avoid writing or including code like this because the problems it can introduce are nefarious and may not be immediately obvious.

Which is the safer way to store `uint8_t*` C buffers in Rust?

I'm doing
#[no_mangle]
pub extern "C" fn receiveBufferAndPrint(buffer: *const u8, size: usize)
{
for i in 0..size {
println!("{}", unsafe { *buffer.offset(i as isize) });
}
}
To receive an uint8_t* buffer from C.
What is the safest way to convert this buffer into a Rust object that deletes this memory when it goes out of scope? I need to deal with buffers safely in Rust, but I don't want to copy the buffer element by element into a new Rust object; I want to wrap it in a Rust object that deletes it when it goes out of scope.
If C is lending you the buffer, then use std::slice::from_raw_parts, and then you can copy that slice to Vec<T> or Box<[T]> if you want to keep the data.
If C is passing ownership of the buffer (i.e it's heap-allocated and C won't free it), then you can use c_vec crate to ensure it's freed properly (using libc::free() instead of Rust's own allocator).
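For the lending case, a minimal sketch could look like the following. The function name `copy_lent_buffer` is made up for illustration, and the sketch assumes the caller guarantees the pointer is valid for `size` bytes for the duration of the call.

```rust
use std::slice;

/// Copies a buffer that C only lends to us into an owned `Vec<u8>`.
///
/// # Safety
/// `buffer` must either be null or be valid for reads of `size` bytes,
/// and the memory must not be modified while this function runs.
pub unsafe fn copy_lent_buffer(buffer: *const u8, size: usize) -> Vec<u8> {
    // `from_raw_parts` requires a non-null pointer even for empty slices.
    if buffer.is_null() || size == 0 {
        return Vec::new();
    }
    // Borrow the C memory as a slice; no copy happens on this line.
    let borrowed: &[u8] = unsafe { slice::from_raw_parts(buffer, size) };
    // One bulk copy into Rust-owned memory, freed by Rust's allocator on drop.
    borrowed.to_vec()
}
```

The borrowed slice must not outlive the call, since C may free or reuse the memory afterwards; the returned `Vec` is independent of it.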

How to fill buffers with mixed types conveniently in standard conformant way?

There are problems, where we need to fill buffers with mixed types. Two examples:
programming OpenGL/DirectX, we need to fill vertex buffers, which can have mixed types (which is basically an array of struct, but the struct maybe described by a run-time data)
creating a memory allocator: putting header/trailer information to the buffer (size, flags, next/prev pointer, sentinels, etc.)
The problem can be described like this:
there is an allocation function, which gives back some memory (new, malloc, OS dependent allocation function, like mmap or VirtualAlloc)
there is a need to put mixed types into an allocated buffer, at various offsets
A solution can be this, for example writing an int to an offset:
void *buffer = <allocate>;
int offset = <some_offset>;
char *ptr = static_cast<char*>(buffer);
*reinterpret_cast<int*>(ptr+offset) = int_value;
However, this is inconvenient, and has UB in at least two places:
ptr+offset is UB, as there is no char array at ptr
writing to the result of reinterpret_cast is UB, as there is no int there
To solve the inconvenience problem, this solution is often used:
union Pointer {
void *asVoid;
bool *asBool;
byte *asByte;
char *asChar;
short *asShort;
int *asInt;
Pointer(void *p) : asVoid(p) { }
};
So, with this union, we can do this:
Pointer p = <allocate>;
p.asChar += offset;
*p.asInt++ = int_value; // write an int to offset
*p.asShort++ = short_value; // then a short afterwards
// other writes here
This solution is convenient for filling buffers, but has further UB, as the solution uses non-active union members.
So, my question is: how can one solve this problem in a strictly standard conformant, and most convenient way? I mean, I'd like to have the functionality which the union solution gives me, but in a standard conformant way.
(Note: suppose, that we have no alignment issues here, alignment is taken care of by using proper offsets)
A simple (and conformant) way to handle these things is leveraging std::memcpy to move whatever values you need into the correct offsets in your storage area, e.g.
std::int32_t value;
char *ptr;
int offset;
// ...
std::memcpy(ptr+offset, &value, sizeof(value));
Do not worry about performance, since your compiler will not actually perform std::memcpy calls in many cases (e.g. small values). Of course, check the assembly output (and profile!), but it should be fine in general.