Translating C++ to Rust, handling a global static object - c++

I'm a beginner translating a familiar C++ project to Rust. The project contains a class called Globals which stores global config parameters. Here is an extract from its .cpp file:
static Globals &__get()
{
    static Globals globals;
    return globals;
}

const Globals &Globals::get()
{
    auto &globals = __get();
    if (!globals.initialized) {
        throw std::runtime_error("Initialize globals first");
    }
    return globals;
}

void Globals::set(const Globals &globals)
{
    __get() = globals;
}
How would I translate this to Rust? From what I can tell, __get() implements some kind of singleton logic. I've read about the lazy_static crate to achieve something similar, but unlocking the variable each time I want to read its value seems to be too verbose. Isn't it possible to achieve this with an interface like Globals::get() in the C++ code?
I rarely post, so if I forgot something, just tell me and I'll provide details. Thanks!

I've read about the lazy_static crate to achieve something similar, but unlocking the variable each time I want to read its value seems to be too verbose.
For a good reason: "safe Rust" includes what you'd call thread-safety by design. An unprotected mutable global is wildly unsafe.
Which... is why interacting with a mutable static requires unsafe (whether reading or writing).
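For illustration, here is what touching a mutable static looks like; a minimal sketch, with a throwaway counter standing in for real state:
static mut COUNTER: u32 = 0;

fn bump() -> u32 {
    // Both the write and the read need an unsafe block: the compiler
    // cannot prove that no other thread touches COUNTER concurrently.
    unsafe {
        COUNTER += 1;
        COUNTER
    }
}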
The translation from C++ is quite straightforward[0] and can easily be inferred from the reference section on statics; the only real divergence is that a Rust static must be initialised.
Also note that if you do not need mutation then you can lazy_static! a readonly value (you don't need the mutex at all), and don't need to unlock anything.
[0] though much simplified by doing away with the unnecessary __get
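For the set-once-then-read-everywhere pattern in the question, a minimal sketch using once_cell (the Globals fields here are made up; substitute the real ones) could look like:
use once_cell::sync::OnceCell;

#[derive(Debug)]
pub struct Globals {
    pub verbose: bool, // hypothetical field
    pub path: String,  // hypothetical field
}

static GLOBALS: OnceCell<Globals> = OnceCell::new();

impl Globals {
    // Mirrors Globals::set in the C++ code; only succeeds once.
    pub fn set(globals: Globals) {
        GLOBALS.set(globals).expect("globals already initialized");
    }

    // Mirrors Globals::get; panics if set() was never called,
    // like the runtime_error in the C++ version.
    pub fn get() -> &'static Globals {
        GLOBALS.get().expect("initialize globals first")
    }
}
Call sites then read Globals::get().verbose with no mutex and nothing to unlock, which is about as close to the C++ Globals::get() interface as safe Rust gets.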

Rust requires memory safety in safe code - thus, you cannot have a mutable static in safe code. You CAN have an atomic static (see AtomicBool or AtomicU64 for examples), but for a normal type, you will need some sort of locking mechanism, such as an RwLock or Mutex (if performance is your thing, the parking_lot crate provides more performant implementations than the Rust standard library).
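For example, a single flag or counter can live in an atomic static with no unsafe and no locking; a small sketch:
use std::sync::atomic::{AtomicU64, Ordering};

static REQUEST_COUNT: AtomicU64 = AtomicU64::new(0);

fn record_request() {
    REQUEST_COUNT.fetch_add(1, Ordering::Relaxed);
}

fn requests_so_far() -> u64 {
    REQUEST_COUNT.load(Ordering::Relaxed)
}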
If you don't want to handle locking yourself, may I suggest making a wrapper object using getter/setter methods?
use std::sync::{Arc, RwLock};
use once_cell::sync::Lazy;

static GLOBAL: Lazy<Global> = Lazy::new(Global::new);

struct GlobalThingymajig {
    pub number: u32,
    pub words: String,
}

pub struct Global(Arc<RwLock<GlobalThingymajig>>);

impl Global {
    pub fn new() -> Self {
        Self(Arc::new(RwLock::new(
            GlobalThingymajig {
                number: 42,
                words: "The Answer to Life, The Universe, and Everything".into(),
            }
        )))
    }

    pub fn number(&self) -> u32 {
        self.0.read().unwrap().number
    }

    pub fn words(&self) -> String {
        self.0.read().unwrap().words.clone()
    }

    pub fn set_number(&self, new_number: u32) {
        let mut writer = self.0.write().unwrap();
        writer.number = new_number;
    }

    pub fn set_words(&self, new_words: String) {
        let mut writer = self.0.write().unwrap();
        writer.words = new_words;
    }
}
You can see this example on the Rust Playground here
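Usage would then look roughly like this (the values are the ones from the snippet above):
fn main() {
    println!("{}", GLOBAL.number()); // prints 42
    GLOBAL.set_number(7);
    println!("{} {}", GLOBAL.number(), GLOBAL.words());
}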

Related

What is the advantage to use OpenGL methods in Rust via a struct rather than accessing them globally?

I am currently porting an OpenGL application written in C++ to Rust and came across a design question that may have other implications since I am pretty new to Rust.
As of now I am using a combination of glutin and gl_generator. Generating global functions in the bindings module yields the following API:
gl::load_with(|symbol| context.get_proc_address(symbol));
// anywhere else since it's global
gl::CreateProgram();
This is pretty much the same paradigm that I used for my wrapper classes in C++ that access global methods:
// ...
FShader::~FShader() {
    if (Handle != 0) {
        glDeleteProgram(Handle);
    }
}

auto FShader::bind() const -> void { glUseProgram(Handle); }
// ...
After reading further posts on the internet I noticed some people suggesting to use the struct generator and work with an object rather than global methods.
let gl = Rc::new(gl::Gl::load_with(|symbol| context.get_proc_address(symbol)));

// ...

pub struct Shader {
    gl: Rc<gl::Gl>,
    handle: u32,
}

impl Drop for Shader {
    fn drop(&mut self) {
        unsafe {
            self.gl.DeleteProgram(self.handle);
        }
    }
}

impl Shader {
    pub fn new(gl: Rc<gl::Gl>) -> Result<Shader, ()> {
        let handle = unsafe { gl.CreateProgram() };
        // ...
        Ok(Shader { gl, handle })
    }
}
This approach is definitely more verbose, since all wrapper classes would have to store a reference to the gl::Gl object to be able to release resources, for instance. The signature of Drop::drop seems to be the reason why sharing the reference across objects is necessary. Are there any other advantages to doing this? Is the use of an Rc the best approach, or is it more reasonable to work with regular references?

"error: underscore lifetimes are unstable" when implementing From<std::sync::PoisonError>

I'm working on a function that looks like this:
fn do_stuff(&mut self, a: MyStruct) -> Result<(), MyError> {
    let x = try!(serde_json::to_vec(a));
    let cache = Arc::clone(self.data); // Get shared reference
    {
        let cache = try!(cache.lock()); // Get lock
        cache.push(x);
    }
    /* Do stuff with other resources */
    Ok(())
}
Where the definition of MyError is:
#[derive(Debug)]
pub enum MyError {
    Serialization(serde_json::Error),
    Synch(PoisonError<MutexGuard<'_, Vec<u8>>>),
}
Before I even get to implementing From<std::sync::PoisonError> for MyError, the compiler already tells me the definition of the Synch variant of my enum is wrong:
error: underscore lifetimes are unstable (see issue #44524)
The declaration using underscore lifetimes actually came from an earlier hint from the compiler when I was trying to figure out the error I should convert from when the lock operation fails. I read the aforementioned issue and that doesn't help me.
What's the full type I should be converting from in order to catch the error from the Mutex::lock operation?
Like so:
#[derive(Debug)]
pub enum MyError<'a> {
    Serialization(serde_json::Error),
    Synch(PoisonError<MutexGuard<'a, Vec<u8>>>),
}
The closest explanation I can find in the book is the section on Lifetime Annotations in Struct Definitions (enums behave the same way).
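With the lifetime parameter in place, the From implementations the question was heading towards can be written like this (a sketch based on the enum above):
use std::sync::{MutexGuard, PoisonError};

impl<'a> From<PoisonError<MutexGuard<'a, Vec<u8>>>> for MyError<'a> {
    fn from(err: PoisonError<MutexGuard<'a, Vec<u8>>>) -> Self {
        MyError::Synch(err)
    }
}

impl<'a> From<serde_json::Error> for MyError<'a> {
    fn from(err: serde_json::Error) -> Self {
        MyError::Serialization(err)
    }
}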
The compiler suggesting unstable syntax as a solution is quite unfair.

Is there a practical way to emulate Go's defer in C, or with C++ destructors?

In short: this is a smart-pointers-in-C question. Reason: embedded programming, and the need to ensure that if a complex algorithm is used, proper deallocation occurs with little effort on the developer's side.
My favorite feature of C++ is its ability to properly deallocate an object allocated on the stack when it goes out of scope. Go's defer provides similar functionality and is a bit closer in spirit to C.
Go's defer would be the desired way of doing things in C. Is there a practical way to add such functionality?
The goal is to simplify tracking when and where an object goes out of scope. Here is a quick example:
struct MyDataType *data = malloc(sizeof(struct MyDataType));
defer(data, deallocator);

if (condition) {
    // deallocator(data) is called automatically
    return;
}

// do something

if (irrelevant) {
    struct DT *localScope = malloc(...);
    defer(localScope, deallocator);
    // deallocator(localScope) is called when we exit this scope
}

struct OtherType *data2 = malloc(...);
defer(data2, deallocator);

if (someOtherCondition) {
    // deallocator(data) and deallocator(data2) are called in the order added
    return;
}
In other languages I could create an anonymous function inside the code block, assign it to a variable, and execute it manually before every return. This would be at least a partial solution. In Go, deferred functions can be chained. Manual chaining with anonymous functions in C is error-prone and impractical.
Thank you
In C++, I've seen "stack-based classes" that follow the RAII pattern. You could make a general-purpose Defer class (or struct) that can take any arbitrary function or lambda.
For example:
#include <cstddef>
#include <cstdlib>   // for EXIT_SUCCESS
#include <functional>
#include <iostream>
#include <string>

using std::cout;
using std::endl;
using std::function;
using std::string;

struct Defer {
    function<void()> action;
    Defer(function<void()> doLater) : action{doLater} {}
    ~Defer() {
        action();
    }
};

void Subroutine(int i) {
    Defer defer1([]() { cout << "Phase 1 done." << endl; });

    if (i == 1) return;
    char const* p = new char[100];
    Defer defer2([p]() { delete[] p; cout << "Phase 2 done, and p deallocated." << endl; });

    if (i == 2) return;
    string s = "something";
    Defer defer3([&s]() { s = ""; cout << "Phase 3 done, and s set to empty string." << endl; });
}

int main() {
    cout << "Call Subroutine(1)." << endl;
    Subroutine(1);
    cout << "Call Subroutine(2)." << endl;
    Subroutine(2);
    cout << "Call Subroutine(3)." << endl;
    Subroutine(3);
    return EXIT_SUCCESS;
}
Many different answers, but a few interesting details have not been mentioned.
Of course C++ destructors are very powerful and should be used often, and sometimes smart pointers can help you. But the mechanism that most closely resembles defer is ON_BLOCK_EXIT/ON_BLOCK_EXIT_OBJ (see http://www.drdobbs.com/cpp/generic-change-the-way-you-write-excepti/184403758 ). Do not forget to read about ByRef.
One big difference between C++ and Go is when the deferred action is called. In C++ it runs when the program leaves the scope where it was created; in Go it runs when the program leaves the function. That means this code won't work at all:
for i := 0; i < 10; i++ {
    mutex.Lock()
    defer mutex.Unlock()
    /* do something under the mutex */
}
Of course C does not pretend to be object oriented, and therefore there are no destructors at all. That helps the readability of the code, because you know that at line X your program does only what is written on that line, in contrast to C++, where each closing curly bracket can cause dozens of destructors to be called.
In C you can use the much-hated goto statement. Don't use it for anything else, but it is practical to have a cleanup label at the end of a function and goto cleanup from many places. It gets a bit more complicated when you want to release more than one resource: then you need more than one cleanup label, and your function finishes with:
cleanup_file:
    fclose(f);
cleanup_mutex:
    pthread_mutex_unlock(mutex);
    return ret;
}
C does not have destructors (unless you count the GCC-specific variable attribute cleanup, which is weird and rarely used; notice also that the GCC function attribute destructor is not what other languages, C++ notably, call a destructor). C++ has them. And C & C++ are very different languages.
In C++11, you might define your own class holding a std::vector of std::function-s, initialized using a std::initializer_list of lambda expressions (and perhaps dynamically augmented by push_back). Its destructor could then mimic Go's deferred statements. But this is not idiomatic.
Go has defer statements, and they are idiomatic in Go.
I recommend sticking to the idioms of your programming languages.
(In other words: don't think in Go while coding in C++)
You could also embed some interpreter (e.g. Lua or Guile) in your application. You might also learn more about garbage collection techniques and concepts and use them in your software (in other words, design your application with its specific GC).
Reason: embedded programming, and the need to ensure that if a complex algorithm is used, proper deallocation occurs with little effort on the developer's side.
You might use arena-based allocation techniques, and de-allocate the arena when suitable... When you think about that, it is similar to copying GC techniques.
Maybe you dream of some homoiconic language with a powerful macro system suitable for meta-programming. Then look into Common Lisp.
I implemented a very simple defer-like facility, modelled on Go's, a few days ago.
The only behaviour different from Go is that my defer will not be executed if you throw an exception and never catch it. Another difference is that it cannot accept a function with multiple arguments the way Go's can, but we can deal with that by having the lambda capture local variables.
The implementation is here.
#include <functional>
#include <type_traits>
#include <utility>

class _Defer {
    std::function<void()> __callback;

public:
    _Defer(_Defer &&);
    ~_Defer();

    template <typename T>
    _Defer(T &&);
};

_Defer::_Defer(_Defer &&__that)
    : __callback{std::forward<std::function<void()>>(__that.__callback)} {
}

template <typename T>
_Defer::_Defer(T &&__callback)
    : __callback{
          static_cast<std::function<void()>>(std::forward<T>(__callback))
      } {
    static_assert(std::is_convertible<T, std::function<void()>>::value,
                  "Cannot be converted to std::function<void()>.");
}

_Defer::~_Defer() {
    this->__callback();
}
And then I defined some macros to make my defer like a keyword in C++ (just for fun)
#define __defer_concatenate(__lhs, __rhs) \
    __lhs##__rhs
#define __defer_declarator(__id) \
    if (0); /* You may have forgotten a `;' or deferred outside of a scope. */ \
    _Defer __defer_concatenate(__defer, __id) =
#define defer \
    __defer_declarator(__LINE__)
The if (0); is used to prevent deferring a function outside of a scope. And then we can use defer as in Go:
#include <iostream>

void foo() {
    std::cout << "foo" << std::endl;
}

int main() {
    defer []() {
        std::cout << "bar" << std::endl;
    };
    defer foo;
}
This will print
foo
bar
to screen.
Go's defer would be the desired way of doing things in C. Is there a practical way to add such functionality?
The goal is to simplify tracking when and where an object goes out of scope.
C does not have any built-in mechanism for automatically invoking any kind of behavior at the end of an object's lifetime. The object itself ceases to exist, and any memory it occupied is available for re-use, but there is no associated hook for executing code.
For some kinds of objects, that is entirely satisfactory by itself -- those whose values do not refer to other objects with allocated storage duration that need to be cleaned up as well. In particular, if struct MyDataType in your example is such a type, then you get automatic cleanup for free by declaring instances as automatic variables instead of allocating them dynamically:
void foo(void) {
    // not a pointer:
    struct MyDataType data /* = initializer */;

    // ...

    /* The memory (directly) reserved for 'data' is released */
}
For objects that require attention at the end of their lifetime, it is generally a matter of code style and convention to ensure that you know when to clean up. It helps, for example, to declare all of your variables at the top of the innermost block containing them, though C itself does not require this. It can also help to structure your code so that for each object that requires custom cleanup, all code paths that may execute during its lifetime converge at the end of that lifetime.
Myself, as a matter of personal best practices, I always try to write any cleanup code needed for a given object as soon as I write its declaration.
In other languages I could create an anonymous function inside the code block, assign it to a variable, and execute it manually before every return. This would be at least a partial solution. In Go, deferred functions can be chained. Manual chaining with anonymous functions in C is error-prone and impractical.
C has neither anonymous functions nor nested ones. It often does make sense, however, to write (named) cleanup functions for data types that require cleanup. These are analogous to C++ destructors, but you must call them manually.
The bottom line is that many C++ paradigms such as smart pointers, and coding practices that depend on them, simply do not work in C. You need different approaches, and they exist, but converting a large body of existing C++ code to idiomatic C is a distinctly non-trivial undertaking.
For those using C, I’ve built a preprocessor in C (open source, Apache license) that inserts the deferred code at the end of each block:
https://sentido-labs.com/en/library/#cedro
GitHub: https://github.com/Sentido-Labs/cedro/
It includes a C utility that wraps the compiler (works out-of-the-box with GCC and clang, configurable) so you can use it as a drop-in replacement for cc, called cedrocc, and if you decide to get rid of it, running cedro on a C source file will produce plain C (see the examples in the manual).
The alternatives I know about are listed in the “Related work” part of the documentation:
Apart from the already mentioned «A defer mechanism for C», there are macros that use a for loop as for (allocation and initialization; condition; release) { actions } [a] or other techniques [b].
[a] “P99 Scope-bound resource management with for-statements” from the same author (2010), “Would it be possible to create a scoped_lock implementation in C?” (2016), ”C compatible scoped locks“ (2021), “Modern C and What We Can Learn From It - Luca Sas [ ACCU 2021 ] 00:17:18”, 2021
[b] “Would it be possible to create a scoped_lock implementation in C?” (2016), “libdefer: Go-style defer for C” (2016), “A Defer statement for C” (2020), “Go-like defer for C that works with most optimization flag combinations under GCC/Clang” (2021)
Compilers like GCC and clang have non-standard features to do this like the __cleanup__ variable attribute.
This implementation avoids dynamic allocation and most limitations of other implementations shown here:
#include <type_traits>
#include <utility>

template<typename F>
struct deferred
{
    std::decay_t<F> f;

    template<typename G>
    deferred(G&& g) : f{std::forward<G>(g)} {}

    ~deferred() { f(); }
};

template<typename G>
deferred(G&&) -> deferred<G>;

#define CAT_(x, y) x##y
#define CAT(x, y) CAT_(x, y)
#define ANONYMOUS_VAR(x) CAT(x, __LINE__)
#define DEFER deferred ANONYMOUS_VAR(defer_variable) = [&]
And use it like this:
#include <iostream>

int main()
{
    DEFER {
        std::cout << "world!\n";
    };

    std::cout << "Hello ";
}
Now, whether to allow exceptions in DEFER is a design choice bordering on philosophy, and I'll leave it to Andrei to fill in the details.
Note that all such deferring facilities in C++ are necessarily bound to the scope in which they are declared, as opposed to Go's defer, which is bound to the function in which it is declared.

Is it possible to export/wrap a complex Go struct to C?

I own a Go library, gofileseq, for which I would like to try to make a C/C++ binding.
It is pretty straightforward to export functions that use simple types (ints, strings, ...). It is even easy enough to export data from custom Go types to C by defining a C struct and translating the Go type to it, to be used in the exported functions, since you are allocating C memory to do it. But with the Go 1.5 cgo rules I am finding it difficult to figure out how to export functionality from a more complex struct that stores state.
Example of a struct from gofileseq that I would like to export somehow to a C++ binding:
// package fileseq
//
type FrameSet struct {
    frange   string
    rangePtr *ranges.InclusiveRanges
}

func NewFrameSet(frange string) (*FrameSet, error) {
    // bunch of processing to set up internal state
}

func (s *FrameSet) Len() int {
    return s.rangePtr.Len()
}

// package ranges
//
type InclusiveRanges struct {
    blocks []*InclusiveRange
}

type InclusiveRange struct {
    start       int
    end         int
    step        int
    cachedEnd   int
    isEndCached bool
    cachedLen   int
    isLenCached bool
}
As you can see, the FrameSet type that I want to expose contains a slice of pointers to an underlying type, each of which stores state.
Ideally, I would love to be able to store a void* on a C++ class, and make it just a simple proxy for calling back into exported Go functions with the void*. But the cgo rules disallow C from storing a Go pointer beyond the duration of the function call. And I am failing to see how I could use an approach of defining C++ classes that could be allocated and used to operate with my Go library.
Is it possible to wrap complex types for exposure to C/C++?
Is there a pattern that would allow a C++ client to create a Go FrameSet?
Edit
One idea I can think of would be to let C++ create objects in Go that get stored on the Go side in a static map[int]*FrameSet and then return the int id to C++. Then all the C++ operations make requests into Go with the id. Does that sound like a valid solution?
Update
For now, I am proceeding with testing a solution that uses global maps and unique ids to store objects. C++ would request a new object to be created and only get back an opaque id. Then they can call all of the methods exported as functions, using that id, including requesting for it to be destroyed when done.
If there is a better approach than this, I would love to see an answer. Once I get a fully working prototype, I will add my own answer.
Update #2
I've written a blog post about the final solution that I ended up using: http://justinfx.com/2016/05/14/cpp-bindings-for-go/
The way I ended up solving this, for lack of a better solution, was to use private global maps on the Go side (ref). These maps would associate instances of the Go objects with a random uint64 id, and the id would be returned to C++ as an "opaque handle".
type frameSetMap struct {
    lock *sync.RWMutex
    m    map[FrameSetId]*frameSetRef
    rand idMaker
}

//...

func (m *frameSetMap) Add(fset fileseq.FrameSet) FrameSetId {
    // fmt.Printf("frameset Add %v as %v\n", fset.String(), id)
    m.lock.Lock()
    id := FrameSetId(m.rand.Uint64())
    m.m[id] = &frameSetRef{fset, 1}
    m.lock.Unlock()
    return id
}
Then I use reference counting to determine when C++ no longer needs the object, and remove it from the map:
// Go
func (m *frameSetMap) Incref(id FrameSetId) {
    m.lock.RLock()
    ref, ok := m.m[id]
    m.lock.RUnlock()
    if !ok {
        return
    }
    atomic.AddUint32(&ref.refs, 1)
    // fmt.Printf("Incref %v to %d\n", ref, refs)
}

func (m *frameSetMap) Decref(id FrameSetId) {
    m.lock.RLock()
    ref, ok := m.m[id]
    m.lock.RUnlock()
    if !ok {
        return
    }
    refs := atomic.AddUint32(&ref.refs, ^uint32(0))
    // fmt.Printf("Decref %v to %d\n", ref, refs)
    if refs != 0 {
        return
    }
    m.lock.Lock()
    if atomic.LoadUint32(&ref.refs) == 0 {
        // fmt.Printf("Deleting %v\n", ref)
        delete(m.m, id)
    }
    m.lock.Unlock()
}
// C++
FileSequence::~FileSequence() {
    if (m_valid) {
        // std::cout << "FileSequence destroy " << m_id << std::endl;
        m_valid = false;
        internal::FileSequence_Decref(m_id);
        m_id = 0;
        m_fsetId = 0;
    }
}
And all C++ interactions with the exported Go library communicate via the opaque handle:
// C++
size_t FileSequence::length() const {
    return internal::FileSequence_Len(m_id);
}
Unfortunately it does mean that in a multithreaded C++ environment, all threads go through a mutex to reach the map. But it is only a write lock when objects are created and destroyed; for all method calls on an object it is a read lock.

Bad practice to call static function from external file via function pointer?

Consider the following code:
file_1.hpp:
typedef void (*func_ptr)(void);
func_ptr file1_get_function(void);
file_1.cpp:
// file_1.cpp
#include "file_1.hpp"

static void some_func(void)
{
    do_stuff();
}

func_ptr file1_get_function(void)
{
    return some_func;
}

file_2.cpp:
#include "file_1.hpp"

void file2_func(void)
{
    func_ptr function_pointer_to_file1 = file1_get_function();
    function_pointer_to_file1();
}
While I believe the above example is technically possible (calling a function with internal linkage only via a function pointer), is it bad practice to do so? Could there be some funky compiler optimizations that take place (auto-inlining, for instance) that would make this situation problematic?
There's no problem, this is fine. In fact, IMHO, it is good practice: it lets your function be called without polluting the space of externally visible symbols.
It would also be appropriate to use this technique in the context of a function lookup table, e.g. a calculator which passes in a string representing an operator name, and expects back a function pointer to the function for doing that operation.
The compiler/linker isn't allowed to make optimizations which break correct code and this is correct code.
Historical note: back in C89, externally visible symbols had to be unique on the first 6 characters; this was relaxed in C99 and also commonly by compiler extension.
In order for this to work, you have to expose some portion of it as external and that's the clue most compilers will need.
Is there a chance that there's a broken compiler out there that will make mincemeat of this strange practice because they didn't foresee someone doing it? I can't answer that.
I can only think of false reasons to want to do this, though: fingerprint hiding, which fails because you have to expose it in the function pointer declaration, unless you are planning to cast your way around things, in which case the question is "how badly is this going to hurt".
The other reason would be facading callbacks - you have some super-sensitive static local function in module m and you now want to expose the functionality in another module for callback purposes, but you want to audit that so you want a facade:
static void voodoo_function() {
}

fnptr get_voodoo_function(const char* file, int line) {
    // you tagged the question as C++, so C++ io it is.
    std::cout << "requested voodoo function from " << file << ":" << line << "\n";
    return voodoo_function;
}

...

// question tagged as c++, so I'm using c++ syntax
auto* fn = get_voodoo_function(__FILE__, __LINE__);
but that's not really helping much, you really want a wrapper around execution of the function.
At the end of the day, there is a much simpler way to expose a function pointer. Provide an accessor function.
static void voodoo_function() {}

void do_voodoo_function() {
    // provide external access to voodoo
    voodoo_function();
}
Because here you provide the compiler with an optimization opportunity - when you link, if you specify whole program optimization, it can detect that this is a facade that it can eliminate, because you let it worry about function pointers.
But is there a really compelling reason not to just remove the static from in front of voodoo_function, other than not exposing the internal name for it? And if so, why is the internal name so precious that you would go to these lengths to hide it?
static void ban_account_if_user_is_ugly() {
    ...;
}

fnptr do_that_thing() {
    return ban_account_if_user_is_ugly;
}

vs

void do_that_thing() { // ban account if user is ugly
    ...
}
--- EDIT ---
Conversion. Your function pointer is int(*)(int) but your static function is unsigned int(*)(unsigned int) and you don't want to have to cast it.
Again: Just providing a facade function would solve the problem, and it will transform into a function pointer later. Converting it to a function pointer by hand can only be a stumbling block for the compiler's whole program optimization.
But if you're casting, let's consider this:
// v1
fnptr get_fn_ptr() {
    // brute force cast because otherwise it's 'hassle'
    return (fnptr)(static_fn);
}

int facade_fn(int i) {
    auto ui = static_cast<unsigned int>(i);
    auto result = static_fn(ui);
    return static_cast<int>(result);
}
OK, unsigned to signed, not a big deal. And then someone comes along and changes what fnptr needs to be to void(int, float);. One of the above becomes a weird runtime crash and the other becomes a compile error.