Storing lots of messages in C++ namespace

Storing lots of messages in C++ namespace - c++

I have a little program that basically prompt questions to user, read the answer, check its validity, and so on, until the creation of an RPG character is fully performed. I rapidly got several classes, each of them related to a bunch of messages to store.
My question is : what is the best way of storing these messages ?
My first (and dirty) try was to have a single file messages.hpp :
namespace M
{
string const message1("what's your name ?");
// + many others
}
but as many classes depended on this file, any tiny change to it implied very long compilation time.
So in order to redistribute the messages, I wanted to include the relevant messages as static members of the classes that needed them, but I found it spoiled a bit readability of my class header. So I was wandering : can I create a subnamespace in a class that will be a message collector like this :
// monster.h
class Monster
{
// members and methods declared here
};
namespace Monster::Messages
{
string const message1("what's your name ?");
// + others messages only used by Monster
}
and could possibly be used like this :
// monster.cpp
Monster::Monster()
{
cout << Messages::message1 << endl;
}
But then, I wondered :
will all the messages of Monster::messages be copied for each instance of the class Monster ?
will all the messages of Monster::messages be copied in all files including monster.hpp (static linkage) ?
Are the messages considered members, static members, or not members at all in Monster ?
What happen if I use the static keyword in Monster::messages, will
it have the signification of static linkage or static member ?
And of course : is it bad practice ? Is there a clever way ?

In my opinion the best practice is to store all those data information in external data file of your game.
I'm going to provide you some benefits of doing so:
Code more clear and light.
Little modifications on data game can be achieved even without compile the binary. That is very important for those data parameters which have to be tested and each time you change the value you have to run the game to see all side effects. (For example some parameters about monster health, damage, GUI position, etc...). In this way the test phase does not require to compile the binary again and it's faster.
Allow other game developers to easily modify game parameters. For example for those persons in the team who are not programmers.
Allow easily to modify game data without change the design in the code. For example let's suppose after two years your game becomes to be very popular and you want to translate it in several languages in order to get more users. In case all those data are embedded in the code, you should re-design and modify your code and, maybe the thing more important, the person who can do that work is only you because only you have the access and the knowledge of the code. In the other case (when the data are stored in external files), more person can easily access and upgrade your game.
How do you can achieve this procedure could be another question on SO, but there are several choices.
For example you can create a unique huge file (that implies some smart tricks, especially for performance reasons) or you can distribute the data parameters in several files.
So your game will have a phase of loading in which the data are read from the disk (as almost all the other games).
Moreover you should think (if you need it) a method in order to crypt those files and avoid game-users can modify them.

Related

How could you recreate UE4 serialization of user-defined types?

The problem
The Unreal Engine 4 Editor allows you to add objects of your own types to the scene.
Doing so requires minimal work from the user - to make a class visible in the editor you only need to add some macros, like UCLASS()
UCLASS()
class MyInputComponent: public UInputComponent //you can instantiate it in the editor!
{
UPROPERTY(EditAnywhere)
bool IsSomethingEnabled;
};
This is enough to allow the editor to serialize the created-in-editor object's data (remember: the class is user-defined but the user doesn't have to hardcode loading specific fields. Also note that the UPROPERTY variable can be of user-defined type as well). It is then deserialized while loading the actual game. So how is it handled so painlessly?
My attempt - hardcoded loading for every new class
class Component //abstract class
{
public:
virtual void LoadFromStream(std::stringstream& str) = 0;
//virtual void SaveIntoStream(std::stringstream& str) = 0;
};
class UserCreatedComponent: public Component
{
std::string Name;
int SomeInteger;
vec3 SomeVector; //example of user-defined type
public:
virtual void LoadFromStream(std::stringstream& str) override //you have to write a function like this every time you create a new class
{
str >> Name >> SomeInteger >> SomeVector.x >> SomeVector.y >> SomeVector.z;
}
};
std::vector<Component*> ComponentsFromStream(std::stringstream& str)
{
std::vector<Component*> components;
std::string type;
while (str >> type)
{
if (type == "UserCreatedComponent") //do this for every user-defined type...
components.push_back(new UserCreatedComponent);
else
continue;
components.back()->LoadFromStream(str);
}
return components;
}
Example of an UserCreatedComponent object stream representation:
UserCreatedComponent MyComponent 5 0.707 0.707 0.707
The engine user has to do these things every time he creates a new class:
1. Modify ComponentsFromStream by adding another if
2. Add two methods, one which loads from stream and another which saves to stream.
We want to simplify it so the user only has to use a macro like UPROPERTY.
Our goal is to free the user from all this work and create a more extensible solution, like UE4's (described above).
Attempt at simplifying 1: Using type-int mapping
This section is based on the following: https://stackoverflow.com/a/17409442/12703830
The idea is that for every new class we map an integer, so when we create an object we can just pass the integer given in the stream to the factory.
Example of an UserCreatedComponent object stream representation:
1 MyComponent 5 0.707 0.707 0.707
This solves the problem of working out the type of created object but also seems to create two new problems:
How should we map classes to integers? What would happen if we include two libraries containing classes that map themselves to the same number?
What will initializing e.g. components that need vectors for construction look like? We don't always use strings and ints for object construction (and streams give us pretty much only that).

So how is it handled so painlessly?
C++ language does not provide features which would allow to implement such simple de/serialization of class instances as it works in the Unreal Engine. There are various ways how to workaround the language limitations, the Unreal uses a code generator.
The general idea is following:
When you start project compilation, a code generator is executed.
The code generator parses your header files and searches for macros which has special meaning, like UCLASS, USTRUCT, UENUM, UPROPERTY, etc.
Based on collected data, it generates not only code for de/serialization, but also for other purposes, like reflection (ability to iterate certain members), information about inheritance, etc.
After that, your code is finally compiled along with the generated code.
Note: this is also why you have to include "MyClass.generated.h" in all header files which declare UCLASS, USTRUCT and similar.
In other words, someone must write the de/serialization code in some form. The Unreal solution is that the author of such code is an application.
If you want to implement such system yourself, be aware that it's lots of work. I'm no expert in this field, so I'll just provide general information:
The primary idea of code-generators is to automatize repetitive work, nothing more - in other words, there's no other special magic. That means that "how objects are de/serialized" (how they're transformed from memory to file) and "how the code which de/serializes is created" (whether it's written by a person or generated by an application) are two separate topics.
First, it should be established how objects are de/serialized. For example, std::stringstream can be used, or objects can be de/serialized from/to generally known formats like XML, json, bson, yaml, etc., or a custom solution can be defined.
Establish what's the source of data for generated de/serialization code. In case of Unreal Engine, it's user code itself. But it's not the only way - for example Protobuffers use a simple language which is used only to define data structure and the generator creates code which you can include and use.
If the source of data should be C++ code itself, do not write you own C++ parser! (The only exceptions to this rule are: educational purpose or if you want to spend rest of your life with working on the parser.) Luckily, there are projects which you can use - for example there's clang AST.
How should we map classes to integers? What would happen if we include two libraries containing classes that map themselves to the same number?
There's one fundamental problem with mapping classes to integers: it's not possible to uniquely map every possible class name to an integer.
Proof: create classes named Foo_[integer] and map it to the [integer], i.e. Foo_0 -> 0, Foo_1 -> 1, Foo_2 -> 2, etc. After you use biggest integer value, how do you map Bar_0?
You can start assigning the numbers sequentially as they're added to a project, but as you correctly pin-pointed, what if you include new library? You could start counting from some big number, like 1.000.000, but how do you determine what should be first number for each library? It doesn't have a clear solution.
Some of solutions to this problem are:
Define clear subset of classes which can be de/serialized and assign sequential integers to these classes. The subset can be, for example, "only classes in my project, no library support".
Identify classes with two integers - one for class, one for library. This means you have to have some central register which assigns library integers uniquely (e.g. in order they're registered).
Use string which uniquely identifies the class including library name. This is what Unreal uses.
Generate a hash from class and library name. There's risk of hash collision - the better hash you use, the lower risk there is. For example git (the version control application) uses SHA-1 (which is considered unsafe today) to identify it's objects (files, directories, commits) and the program is used worldwide without bigger issues.
Generate UUID, a 128-bit random number (with special rules). There's also risk of collision, but it's generally considered highly improbable. Used by Java and Unity the game engine.
What would happen if we include two libraries containing classes that map themselves to the same number?
That's called a collision. How it's handled depends on design of de/serialization code, there are mainly two approaches to this problem:
Detect that. For example if your class identifier contains library identifier, don't allow loading/registering library with ID which is already identified. In case of ID which doesn't include library ID (e.g. hash/UUID variant), don't allow registering such classes. Throw an exception or exit the application.
Assume there's no collision. If actual collision happens, it's so-called UB, an undefined behaviour. The application will probably crash or act weirdly. It might corrupt stored data.
What will initializing e.g. components that need vectors for construction look like? We don't always use strings and ints for object construction (and streams give us pretty much only that).
This depends on what it's required from de/serializing code.
The simplest solution is actually to use string of values separated by space.
For example, let's define following structure:
struct Person
{
std::string Name;
float Age;
};
A vector of Person instances could look like: 3 Adam 22.2 Bob 34.5 Cecil 19.0 (i.e. first serialize number of items (vector size), then individual items).
However, what if you add, remove or rename a member? The serialized data would become unreadable. If you want more robust solution, it might be better to use more structured data, for example YAML:
persons:
- name: Adam
age: 22.2
- name: Bob
age: 34.5
- name: Cecil
age: 19.0
Final notes
The problem of de/serializing objects (in C++) is actually big, various systems uses various solutions. That's why this answer is so generic and it doesn't provide exact code - there's not single silver bullet. Every solution has it's advantages and disadvantages. Even detailed description of just Unreal Engine's serialization system would become a book.
So this answer assumes that reader is able to search for various mentioned topic, like yaml file format, Protobuffers, UUID, etc.
Every mentioned solution to a sub-problem has lots of it's own problems which weren't explored. For example de/serialization of string with spaces or new lines from/to simple string stream. If it's needed to solve such problems, it's recommended to first search for more specialized questions or write one if there's nothing to be found.
Also, C++ is constantly evolving. For example, better support for reflection is added, which might, one day, provide enough features to implement high-quality de/serializer. However, if it should be done in compile-time, it would heavily depend on templates which slow down compilation process significantly and decrease code readibility. That's why code generators might be still considered a better choice.

is allowing direct access to class member variables from outside the class good practice?

Given the following class:
class ToggleOutput {
public:
uint32_t count;
ToggleOutput(PARAMETERS) //I've just removed stuff to reduce the code
{
// The code when setting things up
}
void Update() // public method to toggle a state
{
// this method will check if a time period has elapsed
// if the time period has elapsed, toggle an output
// Each time the output is toggled on then count gets incremented
count += 1;
}
};
Later on in the code, several instances of ToggleOutput get created
ToggleOutput outPut_1(PARAMETERS); // Again, PARAMETERS are just the stuff
ToggleOutput outPut_2(PARAMETERS); // I've cut out for brevity.
ToggleOutput outPut_3(PARAMETERS);
ToggleOutput outPut_4(PARAMETERS);
during execution, I want to do stuff, based on the value of the class member variable, count. eg
if (outPut_1.count >= SOMEVALUE)
do_some_stuff();
I have been told that this is not acceptable. To follow the 'tenets of OOP', class methods should be impletmented to interact with class variables from outside of the class, eg the above code would need to become
if (outPut1.getCount() >= SOMEVALUE)
and the class variable count would need to be made private.
Is this true? Or is it acceptable to allow direct access to class variables if required

Or is it acceptable to allow direct access to class variables if required
A lot of research into good software engineering and programmer productivity indicates that it's typically good to hide the details of how something is implemented. If person A writes a class, then s/he has certain assumptions about how the class should work. If person B wants to use the class, then s/he often has different assumptions about how the class should work (especially if person A did not document the code well, or even at all, as is the case all too often). Then person B is likely to misuse the data in the class, which can break how the class methods work, and lead to errors that are difficult to debug, at least for person B.
In addition, by hiding the details of the class implementation, person A has the freedom to complete rework the implementation, perhaps removing the variable count and replacing it with something else. This can occur because person A figures out a better way to implement count, or because count was in there only as a debugging tool and is not necessary to the actual working of ToggleOutput, etc.
Programmers don't write code only for themselves. In general, they write code for other people, that will be maintained for other people. "Other people" includes you five years from now, when you look at how you implemented something and ask yourself, What on earth was I thinking? By keeping the details of the implementation hidden (including data) you have the freedom to change that, and client classes/software don't need to worry about it as long as the interface remains the same.

Basically, member access is a rule you impose to the developers.
It's something you put in place to prevent yourself or another developer using your class from modifying properties that are supposed to be managed only by the class itself and nobody else.
It has nothing to do with security (well, not necessarily anyway), it's more a matter of semantics. If it's not supposed to be modified externally, it should be private.
And why should you care? Well, it helps you keep your code coherent and organized, which is specially important if you are working with a development team or with code that you intent to distribute.
And if you have to document your class, you only have to do so for stuff that is public, as far as the class user is concerned nothing else matters.

C++ member variable change listeners (100+ classes)

I am trying to make an architecture for a MMO game and I can't figure out how I can store as many variables as I need in GameObjects without having a lot of calls to send them on a wire at the same time I update them.
What I have now is:
Game::ChangePosition(Vector3 newPos) {
gameobject.ChangePosition(newPos);
SendOnWireNEWPOSITION(gameobject.id, newPos);
}
It makes the code rubbish, hard to maintain, understand, extend. So think of a Champion example:
I would have to make a lot of functions for each variable. And this is just the generalisation for this Champion, I might have have 1-2 other member variable for each Champion type/"class".
It would be perfect if I would be able to have OnPropertyChange from .NET or something similar. The architecture I am trying to guess would work nicely is if I had something similar to:
For HP: when I update it, automatically call SendFloatOnWire("HP", hp);
For Position: when I update it, automatically call SendVector3OnWire("Position", Position)
For Name: when I update it, automatically call SendSOnWire("Name", Name);
What are exactly SendFloatOnWire, SendVector3OnWire, SendSOnWire ? Functions that serialize those types in a char buffer.
OR METHOD 2 (Preffered), but might be expensive
Update Hp, Position normally and then every Network Thread tick scan all GameObject instances on the server for the changed variables and send those.
How would that be implemented on a high scale game server and what are my options? Any useful book for such cases?
Would macros turn out to be useful? I think I was explosed to some source code of something similar and I think it used macros.
Thank you in advance.
EDIT: I think I've found a solution, but I don't know how robust it actually is. I am going to have a go at it and see where I stand afterwards. https://developer.valvesoftware.com/wiki/Networking_Entities

On method 1:
Such an approach could be relatively "easy" to implement using a maps, that are accessed via getters/setters. The general idea would be something like:
class GameCharacter {
map<string, int> myints;
// same for doubles, floats, strings
public:
GameCharacter() {
myints["HP"]=100;
myints["FP"]=50;
}
int getInt(string fld) { return myints[fld]; };
void setInt(string fld, int val) { myints[fld]=val; sendIntOnWire(fld,val); }
};
Online demo
If you prefer to keep the properties in your class, you'd go for a map to pointers or member pointers instead of values. At construction you'd then initialize the map with the relevant pointers. If you decide to change the member variable you should however always go via the setter.
You could even go further and abstract your Champion by making it just a collection of properties and behaviors, that would be accessed via the map. This component architecture is exposed by Mike McShaffry in Game Coding Complete (a must read book for any game developer). There's a community site for the book with some source code to download. You may have a look at the actor.h and actor.cpp file. Nevertheless, I really recommend to read the full explanations in the book.
The advantage of componentization is that you could embed your network forwarding logic in the base class of all properties: this could simplify your code by an order of magnitude.
On method 2:
I think the base idea is perfectly suitable, except that a complete analysis (or worse, transmission) of all objects would be an overkill.
A nice alternative would be have a marker that is set when a change is done and is reset when the change is transmitted. If you transmit marked objects (and perhaps only marked properties of those), you would minimize workload of your synchronization thread, and reduce network overhead by pooling transmission of several changes affecting the same object.

Overall conclusion I arrived at: Having another call after I update the position, is not that bad. It is a line of code longer, but it is better for different motives:
It is explicit. You know exactly what's happening.
You don't slow down the code by making all kinds of hacks to get it working.
You don't use extra memory.
Methods I've tried:
Having maps for each type, as suggest by #Christophe. The major drawback of it was that it wasn't error prone. You could've had HP and Hp declared in the same map and it could've added another layer of problems and frustrations, such as declaring maps for each type and then preceding every variable with the mapname.
Using something SIMILAR to valve's engine: It created a separate class for each networking variable you wanted. Then, it used a template to wrap up the basic types you declared (int, float, bool) and also extended operators for that template. It used way too much memory and extra calls for basic functionality.
Using a data mapper that added pointers for each variable in the constructor, and then sent them with an offset. I left the project prematurely when I realised the code started to be confusing and hard to maintain.
Using a struct that is sent every time something changes, manually. This is easily done by using protobuf. Extending structs is also easy.
Every tick, generate a new struct with the data for the classes and send it. This keeps very important stuff always up to date, but eats a lot of bandwidth.
Use reflection with help from boost. It wasn't a great solution.
After all, I went with using a mix of 4, and 5. And now I am implementing it in my game. One huge advantage of protobuf is the capability of generating structs from a .proto file, while also offering serialisation for the struct for you. It is blazingly fast.
For those special named variables that appear in subclasses, I have another struct made. Alternatively, with help from protobuf I could have an array of properties that are as simple as: ENUM_KEY_BYTE VALUE. Where ENUM_KEY_BYTE is just a byte that references a enum to properties such as IS_FLYING, IS_UP, IS_POISONED, and VALUE is a string.
The most important thing I've learned from this is to have as much serialization as possible. It is better to use more CPU on both ends than to have more Input&Output.
If anyone has any questions, comment and I will do my best helping you out.
ioanb7

Friendship not inherited - what are the alternatives?

I have written/am writing a piece of physics analysis code, initially for myself, that will now hopefully be used and extended by a small group of physicists. None of us are C++ gurus. I have put together a small framework that abstracts the "physics event" data into objects acted on by a chain of tools that can easily be swapped in and out depending on the analysis requirements.
This has created two halves to the code: the "physics analysis" code that manipulates the event objects and produces our results via derivatives of a base "Tool"; and the "structural" code that attaches input files, splits the job into parallel runs, links tools into a chain according to some script, etc.
The problem is this: for others to make use of the code it is essential that every user should be able to follow every single step that modifies the event data in any way. The (many) extra lines of difficult structural code could therefore be daunting, unless it is obviously and demonstrably peripheral to the physics. Worse, looking at it in too much detail might give people ideas - and I'd rather they didn't edit the structural code without very good reason - and most importantly they must not introduce anything that affects the physics.
I would like to be able to:
A) demonstrate in an obvious way that
the structural code does not edit the
event data in any way
B) enforce this once other users
begin extending the code themselves
(none of us are
expert, and the physics always comes
first - translation: anything not
bolted down is fair game for a nasty
hack)
In my ideal scenario the event data would be private, with the derived physics tools inheriting access from the Tool base class. Of course in reality this is not allowed. I hear there are good reasons for this, but that's not the issue.
Unfortunately, in this case the method of calling getters/setters from the base (which is a friend) would create more problems than it solves - the code should be as clean, as easy to follow, and as connected to the physics as possible in the implementation of the tool itself (a user should not need to be an expert in either C++ or the inner workings of the program to create a tool).
Given that I have a trusted base class and any derivatives will be subject to close scrutiny, is there any other roundabout but well tested way of allowing access to only these derivatives? Or any way of denying access to the derivatives of some other base?
To clarify the situation I have something like
class Event
{
// The event data (particle collections etc)
};
class Tool
{
public:
virtual bool apply(Event* ev) = 0;
};
class ExampleTool : public Tool
{
public:
bool apply(Event* ev)
{
// do something like loop over the electron collection
// and throw away those will low energy
}
};
The ideal would be to limit access to the contents of Event to only these tools for the two reasons (A and B) above.
Thanks everyone for the solutions proposed. I think, as I suspected, the perfect solution I was wishing for is impossible. dribeas' solution would be perfect in any other setting, but its precisely in the apply() function that the code needs to be as clear and succinct as possible as we will basically spend all day writing/editing apply() functions, and will also need to understand every line of these written by each of the others. Its not so much about capability as readability and effort. I do like the preprocessor solution from "Useless". It doesn't really enforce the separation, but someone would need to be genuinely malicious to break it. To those who suggested a library, I think this will definitely be a good first step, but doesn't really address the two main issues (as I'll still need to provide the source anyway).

There are three access qualifiers in C++: public, protected and private. The sentence with the derived physics tools inheriting access from the Tool base class seems to indicate that you want protected access, but it is not clear whether the actual data that is private is in Tool (and thus protected suffices) or is currently private in a class that befriends Tool.
In the first case, just make the data protected:
class Tool {
protected:
type data;
};
In the second case, you can try to play nasty tricks on the language, like for example, providing an accessor at the Tool level:
class Data {
type this_is_private;
friend class Tool;
};
class Tool {
protected:
static type& gain_acces_to_data( Data& d ) {
return d.this_is_private;
}
};
class OneTool : public Tool {
public:
void foo( Data& d ) {
operate_on( gain_access_to_data(d) );
}
};
But I would avoid it altogether. There is a point where access specifiers stop making sense. They are tools to avoid mistakes, not to police your co-workers, and the fact is that as long as you want them to write code that will need access to that data (Tool extensions) you might as well forget about having absolute protection: you cannot.
A user that wants to gain access to the data might as well just use the newly created backdoor to do so:
struct Evil : Tool {
static type& break_rule( Data & d ) {
return gain_access_to_data( d );
}
};
And now everyone can simply use Evil as a door to Data. I recommend that you read the C++FAQ-lite for more insight on C++.

Provide the code as a library with headers to be used by whoever wants to create tools. This nicely encapsulates the stuff you want to keep intact. It's impossible to prevent hacks if everyone has access to the source and are keen to make changes to anything.

There is also the C-style approach, of restricting visibility rather than access rights. It is enforced more by convention and (to some extent) your build system, rather than the language - although you could use a sort of include guard to prevent "accidental" leakage of the Tool implementation details into the structural code.
-- ToolInterface.hpp --
class Event; // just forward declare it
class ToolStructuralInterface
{
// only what the structural code needs to invoke tools
virtual void invoke(std::list<Event*> &) = 0;
};
-- ToolImplementation.hpp --
class Event
{
// only the tool code sees this header
};
// if you really want to prevent accidental inclusion in the structural code
#define TOOL_PRIVATE_VISIBILITY
-- StructuralImplementation.hpp --
...
#ifdef TOOL_PRIVATE_VISIBILITY
#error "someone leaked tool implementation details into the structural code"
#endif
...
Note that this kind of partitioning lends itself to putting the tool and structural code in seperate libraries - you might even be able to restrict access to the structural code seperately to the tool code, and just share headers and the compiled library.

I've done a shady thing

Are (seemingly) shady things ever acceptable for practical reasons?
First, a bit of background on my code. I'm writing the graphics module of my 2D game. My module contains more than two classes, but I'll only mention two in here: Font and GraphicsRenderer.
Font provides an interface through which to load (and release) files and nothing much more. In my Font header I don't want any implementation details to leak, and that includes the data types of the third-party library I'm using. The way I prevent the third-party lib from being visible in the header is through an incomplete type (I understand this is standard practice):
class Font
{
private:
struct FontData;
boost::shared_ptr<FontData> data_;
};
GraphicsRenderer is the (read: singleton) device that initializes and finalizes the third-party graphics library and also is used to render graphical objects (such as Fonts, Images, etc). The reason it's a singleton is because, as I've said, the class initializes the third-party library automatically; it does this when the singleton object is created and exits the library when the singleton is destroyed.
Anyway, in order for GR to be able to render Font it must obviously have access to its FontData object. One option would be to have a public getter, but that would expose the implementation of Font (no other class other than Font and GR should care about FontData). Instead I considered it's better to make GR a friend of Font.
Note: Until now I've done two things that some may consider shady (singleton and friend), but these are not the things I want to ask you about. Nevertheless, if you think my rationale for making GR a singleton and a friend of Font is wrong please do criticize me and maybe offer better solutions.
The shady thing. So GR has access to Font::data_ though friendship, but how does it know exactly what a FontData is (since it's not defined in the header, it's an incomplete type)? I'll just show the code and the comment that includes the rationale...
// =============================================================================
// graphics/font.cpp
// -----------------------------------------------------------------------------
struct Font::FontData
: public sf::Font
{
// Just a synonym of sf::Font
};
// A redefinition of FontData exists in GraphicsRenderer::printText(),
// which will have to be modified as well if this definition is modified.
// (The redefinition is called FontDataSurogate.)
// Why not have FontData defined only once in a separate header:
// If the definition of FontData changes, most likely printText() text will
// have to be altered also regardless. Considering that and also that FontData
// has (and should have) a very simple definition, a separate header was
// considered too much of an overhead and of little practical advantage.
// =============================================================================
// graphics/graphics_renderer.cpp
// -----------------------------------------------------------------------------
void GraphicsRenderer::printText(const Font& fnt /* ... */)
{
struct FontDataSurogate
: public sf::Font {
};
FontDataSurogate* suro = (FontDataSurogate*)fnt.data_.get();
sf::Font& font = (sf::Font)(*suro);
// ...
}
So that's the shady thing I'm trying to do. Basically what I want is a review of my rationale, so please tell me if you think I've done something horrendous or if not confirm my rationale so I can be a bit surer I'm doing the right thing. :) (This is my biggest project yet and I'm only at the beginning so I'm kinda feeling things in the dark atm.)

In general, if something looks sketchy, I've found that it's often worth going back a few times and trying to figure out exactly why that's necessary. In most cases, some kind of fix pops up (maybe not as "nice", but not relying on any kind of trick).
Now, the first issue I see in your example is this bit of code:
struct FontDataSurogate
: public sf::Font {
};
occurs twice, in different files (neither being a header). That may come back and be a bother when you change one but not the other in the future, and making sure both are identical will very likely be a pain.
To solve that, I would suggest putting the definition to FontDataSurogate and the appropriate includes (whatever library/header defines sf::Font) in a separate header. From the two files that need to use FontDataSurogate, include that definition header (not from any other code files or headers, just those two).
If you have a main class declaration header for your library, place the forward declaration for the class there, and use pointers in your objects and parameters (regular pointers or shared pointers).
You can then use friend or add a get method to retrieve the data, but by moving the class definition to its own header, you've created a single copy of that code and have a single object/file that's interfacing with the other library.
Edit:
You commented on the question while I was writing this, so I'll add on a reply to your comment.
"Too much overhead" - more to document, one more thing to include, the complexity of the code grows, etc.
Not so. You will have one copy of the code, compared to the two that must remain identical now. The code exists either way, so it needs documented, but your complexity and particularly maintenance is simplified. You do gain two #include statements, but is that such a high cost?
"Little practical advantage" - printText() would have to be modified every time FontData is modified regardless of whether or not it's defined in a separate header or not.
The advantage is less duplicate code, making it easier to maintain for you (and others). Modifying the function when the input data changes is not surprising or unusual really. Moving it to another header doesn't cost you anything but the mentioned includes.

friend is fine, and encouraged. See C++ FAQ Lite's rationale for more info: Do friends violate encapsulation?
This line is indeed horrendous, as it invokes undefined behavior: FontDataSurogate* suro = (FontDataSurogate*)fnt.data_.get();

You forward declare the existence of the FontData struct, and then go on to fully declare it in two locations: Font, and GraphicsRenderer. Ew. Now you have to manually keep these exactly binary compatible.
I'm sure it works, but you're right, it is kindof shady. But whenever we say such-and-such is eeevil, we mean to avoid a certain practice, with the caveat that sometimes it can be useful. That being said, I don't think this is one of those times.
One technique is to invert your handling. Instead of putting all of the logic inside GraphicsRenderer, put some of it inside Font. Like so:
class Font
{
public:
void do_something_with_fontdata(GraphicsRenderer& gr);
private:
struct FontData;
boost::shared_ptr<FontData> data_;
};
void GraphicsRenderer::printText(const Font& fnt /* ... */)
{
fnt.do_something_with_fontdata(*this);
}
This way, the Font details are kept within the Font class, and even GraphicsRenderer doesn't need to know the specifics of the implementation. This solves the friend issue too (although I don't think friend is all that bad to use).
Depending on how your code is laid out, and what it's doing, attempting to invert it like this may be quite difficult. If that is the case, simply move the real declaration of FontData to its own header file, and use it in both Font and GraphicsRenderer.

You've spent a lot more effort asking this question then you've supposedly saved by duplicating that code.
You state three reasons you didn't want to add the file:
Extra include
Extra Documentation
Extra Complexity
But I would have to say that 2 and 3 are increased by duplicating that code. Now you document what its doing in the original place and what the fried monkey its doing defined again in another random place in the code base. And duplicating code can only increase the complexity of a project.
The only thing you are saving is an include file. But files are cheap. You should not be afraid of creating them. There is almost zero cost (or at least there should be) to add a new header file.
The advantages of doing this properly:
The compiler doesn't have to make the definition you give compatible
Someday, somebody is going to modify the FontData class without modifying PrintText(), maybe they should modify PrintText(), but they either haven't done it yet or don't know that they need to. Or perhaps in a way that simply hasn't occoured to additional data on FontData make sense. Regardless, the different pieces of code will operate on different assumptions and will explode in a very hard to trace bug.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js