How to design an interface for a simple conversion routine? - c++

I have a simple regex conversion method that I use to do some minor processing to HTML passed in as a std::string. The method looks like:
std::string ParseQuotedPrintableHtml( std::string const& html );
I want to design this method into some sort of small library that can be used across my whole code base. Since it's just a single function, one might be tempted to just create a Utility class (or namespace) and stuff the function in there. I feel this is a bit of a naive design. Any suggestions on a good rule of thumb as to how to design functionality like this into a centralized and accessible location?
EDIT
I should also mention that there are several "helper" functions that this function calls (I also created these, and they are only useful to and used by this method). Ideally these would be "private" in a class, but if I keep this as a global function, those implementation methods will also be accessible in the global namespace (or whichever namespace I place them in).
I guess due to this, it's best to create a utility class maybe?
class QuotedPrintableHtml
{
private:
void HelperMethod1() const;
void HelperMethod2() const;
std::string html_;
public:
QuotedPrintableHtml( std::string const& html ) : html_(html) {}
std::string Parse() const;
};
Perhaps something like this?

I wouldn't advise creating a class: the utility functions don't share any state, so I would just create a namespace such as Utilities to collect these free functions. You can put all the helper functions you don't want to share in an anonymous namespace inside your .cpp file.
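A minimal sketch of that layout, assuming a single .cpp file. The helper name RemoveSoftLineBreaks and its body are hypothetical placeholders (the real helpers and the regex processing are not shown in the question); only the placement of the functions is the point here.

```cpp
// Sketch of quoted_printable_html.cpp. The header would declare only
// Utilities::ParseQuotedPrintableHtml.
#include <cassert>
#include <string>

namespace {

// Hypothetical helper. The anonymous namespace gives it internal linkage,
// so no other translation unit can see or call it.
std::string RemoveSoftLineBreaks(std::string const& text)
{
    std::string result;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        // Skip "=\n", a soft line break in quoted-printable text.
        if (text[i] == '=' && i + 1 < text.size() && text[i + 1] == '\n') {
            ++i;
            continue;
        }
        result += text[i];
    }
    return result;
}

} // namespace

namespace Utilities {

// The only function visible to the rest of the code base.
// Placeholder body: the real implementation would do the regex processing.
std::string ParseQuotedPrintableHtml(std::string const& html)
{
    return RemoveSoftLineBreaks(html);
}

} // namespace Utilities

// Usage check; this would normally live in a caller or a test.
void usageExample()
{
    assert(Utilities::ParseQuotedPrintableHtml("a=\nb") == "ab");
}
```

Callers only ever write Utilities::ParseQuotedPrintableHtml(html); the helper plays the role of a "private" function without needing a class.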

Related

Should I convert a class with only methods to free functions in a namespace?

I originally created a class like so:
class A
{
public:
void run(int x);
private:
void run_helper1();
void run_helper2();
void run_helper3();
int a_;
double b_;
bool c_;
};
Later I realized it really didn't need any state, I just needed the functions. Would it make sense to drop the class and make these free functions in a namespace? If so, I lose the concept of public and private and end up with run_helper1(), run_helper2(), run_helper3() all being public, if I'm not mistaken. That seems like a poor design.
The main difference between a class and a namespace is that a class is closed (though possibly extensible through inheritance) and holds an invariant, while a namespace is open and can be extended at any point. The invariant might be as simple as having a unique address (and being able to compare instance addresses), but, if I understand you correctly, even that is unnecessary.
If there's really no other use of A than to 'group together' functions, then your intuition might be right and you might want to change it to a namespace.
There is, however, one thing a class can do that a namespace can't: a namespace cannot be a template or a template argument. Thus, if you ever need to pass the functions together, e.g. as an API (or a versioned API), then you need to keep them as a class. In that case, the code that uses them is templated over the whole collection of functions and you can have multiple such collections; but it's a rather rare use case.
Thus, normally you can convert it.
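A small sketch of the rare case described above, where a collection of functions is passed as a template argument. The names ApiV1, ApiV2, greet and welcome are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <string>

// Two hypothetical API versions, each a class with only static functions.
struct ApiV1 {
    static std::string greet(std::string const& name) { return "Hello " + name; }
};

struct ApiV2 {
    static std::string greet(std::string const& name) { return "Hi, " + name + "!"; }
};

// The using code is templated over the whole collection of functions.
// A namespace could not be passed here: template arguments must be
// types, values, or templates.
template <typename Api>
std::string welcome(std::string const& name)
{
    return Api::greet(name);
}

// Usage check: the same template works with either collection.
void usageExample()
{
    assert(welcome<ApiV1>("Ann") == "Hello Ann");
    assert(welcome<ApiV2>("Ann") == "Hi, Ann!");
}
```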

What is the best way to represent a single global class instance?

I have a project, which has its own filesystem. The class basically looks like this:
class ResourceManager {
public:
std::string readFile(std::string const&);
private:
std::vector<std::string> root;
};
This class is used everywhere in the project, so it's global and has a single instance. My current solution is to create an instance in the main function and then pass it to all of my classes (they store a reference). I'm absolutely fine with this approach, except that I also need to pass the instance to regular functions, but I'm curious whether it's possible to achieve a better result.
A simple way of making a filesystem is just to have static functions, because you don't really need any variables to store. But in my case I have a variable called root, which stores search directories so that I can do something like this:
rm.addResourceRoot(ResourceType::images, "path/to/directory");
I can have a static variable ResourceManager* instance and method ResourceManager& getRm() { return *instance; } inside a class, but that approach feels kind of weird and probably isn't a "best practice". Probably outside-of-class functions like this will help to make it look better, but I'm still hesitating.
std::string readFile(std::string const& path) {
return ResourceManager::getRm().readFile(path);
}
I can use the singleton pattern, but everyone tries to convince me not to, because it's an anti-pattern and not a good solution at all. This confuses me, so it would be great if anyone could explain the reasons to me.
In my opinion, the ideal access should look like this:
ResourceManager::readFile("path/to/file");
So what is the best way to represent such entities in a project?

Should I totally hide the internal class in my C++ header file when designing my SDK?

I am designing an SDK written in C++.
I have a question: could or should I totally hide the internal class in my public C++ header file?
The code snippets are like the following (in the header file MyPublicClass.h):
namespace PublicNamespace
{
namespace InternalNamespace
{
class MyInternalClass;
}
class MyPublicClass
{
public:
void SomeMemberFunc();
...
private:
std::shared_ptr<InternalNamespace::MyInternalClass> mImpl;
};
}
Per the C++ PImpl design pattern (and also many other materials from Google), it is OK to put the InternalNamespace::MyInternalClass into the public header.
My thought is: it looks unnecessary to let the external users know the internal namespace InternalNamespace, and also the class MyInternalClass. So I want to use void to replace the type InternalNamespace::MyInternalClass.
That's to say, for my case, I use std::shared_ptr<void> as the type of the data member mImpl, and in the .cpp file, use std::static_pointer_cast<InternalNamespace::MyInternalClass>(mImpl) to convert it to the actual class.
(Yeah, I know there is a little cost with this conversion but please ignore it).
Is this design correct or proper? Thanks all.
Don't use void or void * unless there is absolutely no alternative -- using void-pointers prevents the compiler from catching mistakes at compile-time, and leads to pain and suffering.
Using a clearly-labelled InternalNamespace should be good enough (assuming the programmers using your API aren't deliberately looking for trouble -- and if they are, there are plenty of other ways for them to find it anyway), although if you wanted to hide MyInternalClass entirely from calling code, you could instead make it an inner-class of MyPublicClass, i.e. something like this:
namespace PublicNamespace
{
class MyPublicClass
{
public:
void SomeMemberFunc();
...
private:
class MyInternalClass
{
[...]
};
std::shared_ptr<MyInternalClass> mImpl;
};
}
Since it's declared in the private section of MyPublicClass, no calling code outside of MyPublicClass would be able to access it at all.
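One possible variation, which goes slightly beyond what the answer states: the nested class can be only declared in the header and defined in the .cpp file, so its definition never appears in the public header at all. The sketch below puts both parts in one file for brevity. The Compute member and the int return type of SomeMemberFunc are assumptions made only so the result is observable.

```cpp
#include <cassert>
#include <memory>

// ---- What would go in MyPublicClass.h ----
namespace PublicNamespace
{
class MyPublicClass
{
public:
    MyPublicClass();
    int SomeMemberFunc();   // returns int here only for illustration
private:
    class MyInternalClass;  // declared only; private, so callers cannot name it
    std::shared_ptr<MyInternalClass> mImpl;
};
}

// ---- What would go in MyPublicClass.cpp ----
namespace PublicNamespace
{
// The full definition is visible only in this translation unit.
class MyPublicClass::MyInternalClass
{
public:
    int Compute() const { return 42; }   // hypothetical internal work
};

MyPublicClass::MyPublicClass() : mImpl(std::make_shared<MyInternalClass>()) {}

int MyPublicClass::SomeMemberFunc() { return mImpl->Compute(); }
}

// Usage check: calling code sees only the public class.
void usageExample()
{
    PublicNamespace::MyPublicClass obj;
    assert(obj.SomeMemberFunc() == 42);
}
```

This keeps the type safety that std::shared_ptr<void> would give up, because the pointer type is still MyInternalClass everywhere inside the implementation.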

Why is Microsoft using struct rather than class in new code?

So normally I wouldn't ask a question like this because it seems like it could be opinion based or start some sort of verbal war on coding practices, but I think there might be a technical reason here that I don't understand.
I was looking over the code in the header files for vcpkg (a library package manager that Microsoft is creating, so it is "new" code) because reading code is generally a good way to learn things you didn't know.
The first thing I noticed was the use of using rather than typedef.
Snippet from 'https://github.com/microsoft/vcpkg/blob/master/toolsrc/include/vcpkg/parse.h'
template<class P>
using ParseExpected = ExpectedT<std::unique_ptr<P>, std::unique_ptr<ParseControlErrorInfo>>;
I haven't personally used using this way before, so I looked at an answer to: What is the difference between 'typedef' and 'using' in C++11? Essentially, using is the new way to do it, and the benefit is that it can use templates. So Microsoft had a good reason to use using instead of typedef.
Looking at 'https://github.com/microsoft/vcpkg/blob/master/toolsrc/include/vcpkg/commands.h' I noticed that they did not use any classes. Instead it was only namespaces with a function or so in them, i.e.:
namespace vcpkg::Commands
{
namespace BuildExternal
{
void perform_and_exit(const VcpkgCmdArguments& args, const VcpkgPaths& paths, const Triplet& default_triplet);
}
}
I'm guessing that part of this is that the calling syntax looks essentially just like a static member function in a class, so the code performs the same but maybe saves some overhead by being a namespace instead of a class. (If anyone has any ideas on this too that would be great.)
Now the main point of all this. Why is Microsoft using structs instead of classes in their namespaces?
Snippet from 'https://github.com/microsoft/vcpkg/blob/master/toolsrc/include/vcpkg/parse.h':
namespace vcpkg::Parse
{
/* ... Code I'm excluding for brevity ... */
struct ParagraphParser
{
ParagraphParser(RawParagraph&& fields) : fields(std::move(fields)) {}
void required_field(const std::string& fieldname, std::string& out);
std::string optional_field(const std::string& fieldname) const;
std::unique_ptr<ParseControlErrorInfo> error_info(const std::string& name) const;
private:
RawParagraph&& fields;
std::vector<std::string> missing_fields;
};
}
Searching stackoverflow, I found an old question: Why Microsoft uses a struct for directX library instead of a class?
The answers there were essentially that you don't have to declare things as public because that is the default, plus a comment way at the bottom saying that it was old code.
If vcpkg was old code I would be completely satisfied, however, this is new code. Is it just some style they have that is a carry over (but using vs typedef isn't)? Or is it to save a line of code (public:)? Or is there some sort of overhead benefit? Or some other thing I haven't considered at all?
The only differences between struct and class are:
1. the default member access (public vs private) and
2. the default inheritance if you inherit from the type (public inheritance vs private inheritance).
The end result of 1 will be the same once the author has finished adding public:/private: to the type. 2 you can easily control yourself by being explicit when you inherit, rather than rely on the default. It's hardly a big deal and doesn't really matter.
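Both defaults can be seen in a short sketch. The type names here (Base, S, C, DerivedStruct, DerivedClass) are hypothetical; the checks run at compile time.

```cpp
#include <type_traits>

struct Base { int value = 0; };

// 1. Default member access.
struct S { int x = 1; };          // x is public by default
class C {
    int x = 2;                    // x is private by default
public:
    int get() const { return x; } // access must be granted explicitly
};

// 2. Default inheritance.
struct DerivedStruct : Base {};   // inherits publicly by default
class DerivedClass : Base {};     // inherits privately by default

// Public inheritance allows the implicit Derived* -> Base* conversion
// from outside code; private inheritance does not.
static_assert(std::is_convertible<DerivedStruct*, Base*>::value,
              "struct inherits publicly by default");
static_assert(!std::is_convertible<DerivedClass*, Base*>::value,
              "class inherits privately by default");
```

Writing class DerivedClass : public Base {} removes the second difference, which is why being explicit makes the choice of keyword mostly a matter of style.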
As to why Microsoft uses struct rather than class in their code, you will have to ask some Microsoft people.
Regarding free functions vs static functions, I don't think there is any overhead in using a class (I haven't measured this at all; I would just expect most compilers to recognize that the class is basically just a namespace for the function). The thing is just: you don't need a class.
Using a class with only static functions is basically abusing the class as a namespace. So if you are only doing that, then be explicit about it and just use a namespace. Having a class there would only be confusing, since you would think that there could be some state here and only find out that there is none when you see that the function in the class is static.
This is especially relevant if this is used a bit wrongly. Imagine someone instantiates a class A a with static member function f to call a.f(). It is no problem regarding performance, since the construction is a no-op and it will pretty much be equivalent to A::f(). But for the reader it seems like there is some kind of state involved and that is just confusing.
Regarding the other two: using is just superior to typedef through being able to use templates and is (IMO) more readable. The struct vs class issue is just a question of which has the better defaults; it's not a big difference, but most often what you want is what a struct does, so there is no reason to use a class.
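To illustrate the point about templates, here is a short sketch with hypothetical names (StringMap, StringMapOld). An alias declared with using can itself be a template, whereas typedef needs a wrapper struct.

```cpp
#include <map>
#include <string>
#include <type_traits>

// With using, the alias can itself be a template (C++11 and later).
template <typename Value>
using StringMap = std::map<std::string, Value>;

// typedef cannot be templated directly; the older workaround
// wraps it in a struct and exposes a nested type.
template <typename Value>
struct StringMapOld {
    typedef std::map<std::string, Value> type;
};

// Both spell the same type; only the syntax at the use site differs.
static_assert(std::is_same<StringMap<int>, StringMapOld<int>::type>::value,
              "both aliases name std::map<std::string, int>");
```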
To be (more) compatible with C.
To avoid having to make everything public with the public: keyword, since all COM objects, for example, have only public member functions.

Member functions for derived information in a class

While designing an interface for a class, I am often in two minds about whether I should provide member functions whose results can be calculated / derived from combinations of other member functions. For example:
class DocContainer
{
public:
Doc* getDoc(int index) const;
bool isDocSelected(Doc*) const;
int getDocCount() const;
//Should this method be here???
//This method returns the selected documents in the container (in selectedDocs_out)
void getSelectedDocs(std::vector<Doc*>& selectedDocs_out) const;
};
Should I provide this as a class member function or probably a namespace where I can define this method? Which one is preferred?
In general, you should probably prefer free functions. Think about it from an OOP perspective.
If the function does not need access to any private members, then why should it be given access to them? That's not good for encapsulation. It means more code that may potentially fail when the internals of the class are modified.
It also limits the possible amount of code reuse.
If you wrote the function as something like this:
template <typename T>
bool getSelectedDocs(T& container, std::vector<Doc*>&);
Then the same implementation of getSelectedDocs will work for any class that exposes the required functions, not just your DocContainer.
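A possible implementation of such a non-member function, as a sketch. Doc and DocContainer are minimal hypothetical stand-ins for the types in the question (the add member is invented for the example), and this version keeps the void return type of the original member function instead of the bool shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Doc { bool selected; };   // hypothetical stand-in for the real Doc

// Hypothetical minimal container exposing the interface from the question.
class DocContainer
{
public:
    void add(Doc* doc) { docs_.push_back(doc); }
    Doc* getDoc(int index) const { return docs_[static_cast<std::size_t>(index)]; }
    bool isDocSelected(Doc* doc) const { return doc->selected; }
    int getDocCount() const { return static_cast<int>(docs_.size()); }
private:
    std::vector<Doc*> docs_;
};

// Non-member: it uses only the public interface, so it works with any
// type that provides getDoc, isDocSelected and getDocCount.
template <typename T>
void getSelectedDocs(T const& container, std::vector<Doc*>& selectedDocs_out)
{
    for (int i = 0; i < container.getDocCount(); ++i) {
        Doc* doc = container.getDoc(i);
        if (container.isDocSelected(doc)) {
            selectedDocs_out.push_back(doc);
        }
    }
}

// Usage check.
void usageExample()
{
    Doc first{true};
    Doc second{false};
    DocContainer docs;
    docs.add(&first);
    docs.add(&second);
    std::vector<Doc*> selected;
    getSelectedDocs(docs, selected);
    assert(selected.size() == 1 && selected[0] == &first);
}
```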
Of course, if you don't like templates, an interface could be used, and then it'd still work for any class that implemented this interface.
On the other hand, if it is a member function, then it'll only work for this particular class (and possibly derived classes).
The C++ standard library follows the same approach. Consider std::find, for example, which is made a free function for this precise reason. It doesn't need to know the internals of the class it's searching in. It just needs some implementation that fulfills its requirements. Which means that the same find() implementation can work on any container, in the standard library or elsewhere.
Scott Meyers argues for the same thing.
If you don't like it cluttering up your main namespace, you can of course put it into a separate namespace with functionality for this particular class.
I think it's fine to have getSelectedDocs as a member function. It's a perfectly reasonable operation for a DocContainer, so it makes sense as a member. Member functions should be there to make the class useful. They don't need to satisfy some sort of minimality requirement.
One disadvantage to moving it outside the class is that people will have to look in two places when they try to figure out how to use a DocContainer: they need to look in the class and also in the utility namespace.
The STL has basically aimed for small interfaces, so in your case, if and only if getSelectedDocs can be implemented more efficiently than a combination of isDocSelected and getDoc it would be implemented as a member function.
This technique may not be applicable everywhere, but it's a good rule of thumb to prevent clutter in interfaces.
I agree with the answers from Konrad and jalf. Unless there is a significant benefit from having "getSelectedDocs" then it clutters the interface of DocContainer.
Adding this member triggers my smelly code sensor. DocContainer is obviously a container so why not use iterators to scan over individual documents?
class DocContainer
{
public:
iterator begin ();
iterator end ();
// ...
bool isDocSelected (Doc *) const;
};
Then, use a functor that creates the vector of documents as it needs to:
typedef std::vector <Doc*> DocVector;
class IsDocSelected {
public:
IsDocSelected (DocContainer const & docs, DocVector & results)
: docs (docs)
, results (results)
{}
void operator()(Doc & doc) const
{
if (docs.isDocSelected (&doc))
{
results.push_back (&doc);
}
}
private:
DocContainer const & docs;
DocVector & results;
};
void foo (DocContainer & docs)
{
DocVector results;
std::for_each (docs.begin ()
, docs.end ()
, IsDocSelected (docs, results));
}
This is a bit more verbose (at least until we have lambdas), but an advantage to this kind of approach is that the specific type of filtering is not coupled with the DocContainer class. In the future, if you need a new list of documents that are "NotSelected" there is no need to change the interface to DocContainer, you just write a new "IsDocNotSelected" class.
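With C++11 lambdas available, the functor class can be replaced by a lambda. The following is a sketch under assumptions: Doc and DocContainer are minimal hypothetical stand-ins (the add member and the std::vector<Doc> storage are invented for the example), and selectedDocs is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Doc { bool selected; };   // hypothetical stand-in for the real Doc

class DocContainer               // hypothetical minimal container with iterators
{
public:
    typedef std::vector<Doc>::iterator iterator;
    void add(Doc const& doc) { docs_.push_back(doc); }
    iterator begin() { return docs_.begin(); }
    iterator end() { return docs_.end(); }
    bool isDocSelected(Doc* doc) const { return doc->selected; }
private:
    std::vector<Doc> docs_;
};

typedef std::vector<Doc*> DocVector;

DocVector selectedDocs(DocContainer& docs)
{
    DocVector results;
    // The lambda replaces the IsDocSelected functor class; it captures
    // docs and results by reference, as the functor's members did.
    std::for_each(docs.begin(), docs.end(), [&](Doc& doc) {
        if (docs.isDocSelected(&doc)) {
            results.push_back(&doc);
        }
    });
    return results;
}

// Usage check.
void usageExample()
{
    DocContainer docs;
    docs.add(Doc{true});
    docs.add(Doc{false});
    assert(selectedDocs(docs).size() == 1);
}
```

A "not selected" filter then only needs a different lambda body, with no change to DocContainer.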
The answer is probably "it depends"...
If the class is part of a public interface to a library that will be used by many different callers then there's a good argument for providing a multitude of functionality to make it easy to use, including some duplication and/or crossover. However, if the class is only being used by a single upstream caller then it probably doesn't make sense to provide multiple ways to achieve the same thing. Remember that all the code in the interface has to be tested and documented, so there is always a cost to adding that one last bit of functionality.
I think this is perfectly valid if the method:
fits in the class responsibilities
is not specific to only a small fraction of the class's clients (it should be useful to, say, at least 20% of them)
This is especially true if the method contains complex logic/computation that would be more expensive to maintain in many places than only in the class.