Returning a pointer in C++ [closed] - c++

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions concerning problems with code you've written must describe the specific problem — and include valid code to reproduce it — in the question itself. See SSCCE.org for guidance.
Closed 8 years ago.
Improve this question
I think my question sounds stupid and welcome downvote on me. If you are implementing a method in C++ which needs to return a pointer, is it safe to do that? If not, why?

Not a simple question. For instance: Best way of returning a pointer.
Ideally, you should try to avoid returning values that come with side-effects or obligations.
// This may be ok, it implies no burden on the user.
Manager* GetManager();
// But what if the user decides to call delete on the value you return?
// This is not unusual in C code, but carries a hidden contract:
// I allocate - you free.
const char* GetFilename(int fd)
{
char* filename = malloc(256);
sprintf(filename, "/tmp/tmpfile.%d", fd);
return filename;
}
C++ is about encapsulation and abstraction. You can codify the contract with your consumer by encapsulating a pointer you want to return. The idea here is that instead of exposing a pointer, you expose an object which is responsible for ownership of the pointer. Infact, recent versions of the language already do this for you with std::unique_ptr, std::shared_ptr and std::weak_ptr.
But a crude, simple RAII example might be:
class StrDupPtr
{
char* m_alloc;
public:
StrDupPtr(const char* src)
: m_alloc(strdup(src))
{}
~StrDupPtr()
{
free(m_alloc);
}
operator const char* () const { return m_alloc; }
// etc.
};
You're still returning a pointer here, but you've encapsulated it with a management contract and removed burden from the end-user to manage your resources.
You can't always avoid it, and when you have to, yes it can be dangerous.
int* AllocateMeSomeMemory()
{
int* memory = malloc(4 * sizeof(int));
// here, have four ints.
return memory;
}
int main() {
int* memory = AllocateMeSomeMemory();
memory[42] = 0xDeath; // yeah, it's not a valid hex number, but that's not really the problem.
}
Another common problem with pointers is that there's no way to tell how many people have them. Here's a contrived example:
void transferItem(userid_t user1, userid_t user2, itemid_t item) {
Account* a1 = GetAccount(user1);
Account* a2 = GetAccount(user2);
if (a1 != a2) {
transferItemInternal(a1, a2, item);
}
delete a2;
delete a1; // Sorry Dave, I can't do that. How about a nice game of CRASH?
}
Normally, a2 and a1 will be different, but when they're not...
Another common failure pattern with pointers is asynchronous callbacks:
// ask the database for user details, call OnLoginResult with userObj when we're done.
void login(int socket, userid_t userId, passwordhash_t pass) {
User* userObj = GetUserObj(userId, socket);
Query* query = Database()->NewQuery("SELECT * FROM user WHERE id = ? AND password = ?", userId, pass);
Database()->Queue(query, OnLoginResult, userObj);
}
void OnDisconnect(int socket, int reason) {
User* userObj = GetUserBySocket(socket);
if (userObj) {
UnregisterUserObj(userObj);
delete userObj;
}
}
void OnLoginResult(void* param) {
User* userObj = static_cast<UserObj*>(param);
// all well and good unless the user disconnected while waiting.
...
}

Yes it is. I assume you mean "Allocate and return" a pointer.
Its common to have initialisation functions which allocate a pointer to an object of some type, and then initialise the object itself. It will then be up to a different part of the program to release the memory.

Well it always depends on what you are doing. A pointer is simply a memory address, so it is similar to simply returning an integer. You should do more research on pointers and how to properly implement them

I sense this question might be closed quite soon, but I'll try to answer anyway.
Yes, it's "safe", as long as you're careful. In fact, it's a very common way to do things, particularly if you're interfacing with C APIs. Having said that, it's best to avoid doing so if you can, because C++ generally provides better alternatives.
Why should you avoid it? Firstly, let's say you have a method that looks like this:
MyStruct* get_data();
Is the return value a pointer to a single instance of MyStruct, or the start of an array? Do you need to free() the returned pointer? Or perhaps you need to use delete? Can the return value be NULL, and what happens if it is? Without looking at the documentation, you have no way of knowing any of these things. And the compiler has no way of knowing either, so it can't help you out in any way.
Better options:
If you want to return an array of values, use a std::array (if the size is fixed at compile-time), or a std::vector (if the size isn't known till run-time).
If you're trying to avoid copying a large struct, then return a reference, or a const reference if possible. That way the caller knows they won't receive a NULL value.
If you really need to return a pointer, than consider using a smart pointer instead -- that will help you sort out ownership issues. For example, std::shared_ptr uses reference counting, and std::unique_ptr ensures that a given pointer only ever has one owner.

Related

Destructor not being called with smart or raw pointer [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 8 years ago.
Improve this question
Continuing from a previous question:
On this code, the destructors for Apple and Fruit don't get called at all. I have std::cerr statements in both and there's some clean up code in Apple that doesn't run. I thought calling delete was enough? Am I doing RAII correctly? I've also replaced the raw pointer with std::unique_ptr and the same result.
int32_t Fruit::frutificate(const Settings& settings) {
Fruit *listener;
if (settings.has_domain_socket()) {
listener = new Apple(settings);
} else {
listener = new Orange(settings);
}
int r = uv_run(listener->loop, UV_RUN_DEFAULT);
delete listener;
return r;
}
Update:
All classes have virtual destructors.
First, your immediate problem almost certainly that is that ~Fruit() is not virtual. Add that (virtual ~Fruit() = default or virtual ~Fruit() {} to class Fruit) and your code (as posted) will magically start to work.
However that is not what your code should be. Just working, well, not good enough.
There are a number of improvements you can make to your code. As a first improvement we'll use a unique_ptr: (as #Deduplicator mentioned above in comments)
int32_t Fruit::frutificate(const Settings& settings) {
std::unique_ptr<Fruit> listener;
if (settings.has_domain_socket()) {
listener.reset( new Apple(settings) );
} else {
listener.reset( new Orange(settings) );
}
int r = uv_run(listener->loop, UV_RUN_DEFAULT);
return r;
}
which uses RAII to ensure the lifetime of the listener is bounded. Much better, no more manual delete (which could be missed by accident, or exception).
In C++14, the .reset(new Blah(whatever)) can be replaced with = std::make_unique<Blah>(whatever);, and now your code never explicitly calls new and delete, which is a good habit to get into. However your code is tagged C++11, so I'll leave the C++11 version up above.
While that is better, we can do best. There is no need for using the free store (heap) at all.
A simple way to avoid the free store use is: (stolen from #Jarod in comments above)
int32_t Fruit::frutificate(const Settings& settings) {
if (settings.has_domain_socket()) {
return uv_run(Apple(settings).loop, UV_RUN_DEFAULT);
} else {
return uv_run(Orange(settings).loop, UV_RUN_DEFAULT);
}
}
which has the disadvantage of repeating the uv_run code (and can thus breed bugs). We can fix this with a lambda:
int32_t Fruit::frutificate(const Settings& settings) {
auto fruit_the_uv = [&](Fruit&& fruit) {
return uv_run(fruit.loop, UV_RUN_DEFAULT);
};
if (settings.has_domain_socket()) {
return fruit_the_uv( Apple(settings) );
} else {
return fruit_the_uv( Orange(settings) );
}
}
where we factor out the common code into a lambda, and then invoke it on the two branches. I used rvalue references as we are passing in temporary fruit.
Plus, fruit_the_uv reminds me of a 90s rap song whenever I read it. And that is a plus.

Several specific methods or one generic method?

this is my first question after long time checking on this marvelous webpage.
Probably my question is a little silly but I want to know others opinion about this. What is better, to create several specific methods or, on the other hand, only one generic method? Here is an example...
unsigned char *Method1(CommandTypeEnum command, ParamsCommand1Struct *params)
{
if(params == NULL) return NULL;
// Construct a string (command) with those specific params (params->element1, ...)
return buffer; // buffer is a member of the class
}
unsigned char *Method2(CommandTypeEnum command, ParamsCommand2Struct *params)
{
...
}
unsigned char *Method3(CommandTypeEnum command, ParamsCommand3Struct *params)
{
...
}
unsigned char *Method4(CommandTypeEnum command, ParamsCommand4Struct *params)
{
...
}
or
unsigned char *Method(CommandTypeEnum command, void *params)
{
switch(command)
{
case CMD_1:
{
if(params == NULL) return NULL;
ParamsCommand1Struct *value = (ParamsCommand1Struct *) params;
// Construct a string (command) with those specific params (params->element1, ...)
return buffer;
}
break;
// ...
default:
break;
}
}
The main thing I do not really like of the latter option is this,
ParamsCommand1Struct *value = (ParamsCommand1Struct *) params;
because "params" could not be a pointer to "ParamsCommand1Struct" but a pointer to "ParamsCommand2Struct" or someone else.
I really appreciate your opinions!
General Answer
In Writing Solid Code, Steve Macguire's advice is to prefer distinct functions (methods) for specific situations. The reason is that you can assert conditions that are relevant to the specific case, and you can more easily debug because you have more context.
An interesting example is the standard C run-time's functions for dynamic memory allocation. Most of it is redundant, as realloc can actually do (almost) everything you need. If you have realloc, you don't need malloc or free. But when you have such a general function, used for several different types of operations, it's hard to add useful assertions and it's harder to write unit tests, and it's harder to see what's happening when debugging. Macquire takes it a step farther and suggests that, not only should realloc just do _re_allocation, but it should probably be two distinct functions: one for growing a block and one for shrinking a block.
While I generally agree with his logic, sometimes there are practical advantages to having one general purpose method (often when operations is highly data-driven). So I usually decide on a case by case basis, with a bias toward creating very specific methods rather than overly general purpose ones.
Specific Answer
In your case, I think you need to find a way to factor out the common code from the specifics. The switch is often a signal that you should be using a small class hierarchy with virtual functions.
If you like the single method approach, then it probably should be just a dispatcher to the more specific methods. In other words, each of those cases in the switch statement simply call the appropriate Method1, Method2, etc. If you want the user to see only the general purpose method, then you can make the specific implementations private methods.
Generally, it's better to offer separate functions, because they by their prototype names and arguments communicate directly and visibly to the user that which is available; this also leads to more straightforward documentation.
The one time I use a multi-purpose function is for something like a query() function, where a number of minor query functions, rather than leading to a proliferation of functions, are bundled into one, with a generic input and output void pointer.
In general, think about what you're trying to communicate to the API user by the API prototypes themselves; a clear sense of what the API can do. He doesn't need excessive minutae; he does need to know the core functions which are the entire point of having the API in the first place.
First off, you need to decide which language you are using. Tagging the question with both C and C++ here makes no sense. I am assuming C++.
If you can create a generic function then of course that is preferable (why would you prefer multiple, redundant functions?) The question is; can you? However, you seem to be unaware of templates. We need to see what you have omitted here to tell if you if templates are suitable however:
// Construct a string (command) with those specific params (params->element1, ...)
In the general case, assuming templates are appropriate, all of that turns into:
template <typename T>
unsigned char *Method(CommandTypeEnum command, T *params) {
// more here
}
On a side note, how is buffer declared? Are you returning a pointer to dynamically allocated memory? Prefer RAII type objects and avoid dynamically allocating memory like that if so.
If you are using C++ then I would avoid using void* as you don't really need to. There is nothing wrong with having multiple methods. Note that you don't actually have to rename the function in your first set of examples - you can just overload a function using different parameters so that there is a separate function signature for each type. Ultimately, this kind of question is very subjective and there are a number of ways of doing things. Looking at your functions of the first type, you would perhaps be well served by looking into the use of templated functions
You could create a struct. That's what I use to handle console commands.
typedef int (* pFunPrintf)(const char*,...);
typedef void (CommandClass::*pKeyFunc)(char *,pFunPrintf);
struct KeyCommand
{
const char * cmd;
unsigned char cmdLen;
pKeyFunc pfun;
const char * Note;
long ID;
};
#define CMD_FORMAT(a) a,(sizeof(a)-1)
static KeyCommand Commands[]=
{
{CMD_FORMAT("one"), &CommandClass::CommandOne, "String Parameter",0},
{CMD_FORMAT("two"), &CommandClass::CommandTwo, "String Parameter",1},
{CMD_FORMAT("three"), &CommandClass::CommandThree, "String Parameter",2},
{CMD_FORMAT("four"), &CommandClass::CommandFour, "String Parameter",3},
};
#define AllCommands sizeof(Commands)/sizeof(KeyCommand)
And the Parser function
void CommandClass::ParseCmd( char* Argcommand )
{
unsigned int x;
for ( x=0;x<AllCommands;x++)
{
if(!memcmp(Commands[x].cmd,Argcommand,Commands[x].cmdLen ))
{
(this->*Commands[x].pfun)(&Argcommand[Commands[x].cmdLen],&::printf);
break;
}
}
if(x==AllCommands)
{
// Unknown command
}
}
I use a thread safe printf pPrintf, so ignore it.
I don't really know what you want to do, but in C++ you probably should derive multiple classes from a Formatter Base class like this:
class Formatter
{
virtual void Format(unsigned char* buffer, Command command) const = 0;
};
class YourClass
{
public:
void Method(Command command, const Formatter& formatter)
{
formatter.Format(buffer, command);
}
private:
unsigned char* buffer_;
};
int main()
{
//
Params1Formatter formatter(/*...*/);
YourClass yourObject;
yourObject.Method(CommandA, formatter);
// ...
}
This removes the resposibility to handle all that params stuff from your class and makes it closed for changes. If there will be new commands or parameters during further development you don't have to modifiy (and eventually break) existing code but add new classes that implement the new stuff.
While not full answer this should guide you in correct direction: ONE FUNCTION ONE RESPONSIBILITY. Prefer the code where it is responsible for one thing only and does it well. The code whith huge switch statement (which is not bad by itself) where you need cast void * to some other type is a smell.
By the way I hope you do realise that according to standard you can only cast from void * to <type> * only when the original cast was exactly from <type> * to void *.

c++, msxml and smart pointers

I need parse some XML and wrote some helpers. I am not expert in C++, actually I wrote with c more then seven years ago. So, I would to make sure, is the approach, what i use correct or not :)
1) I implemented some simple helpers, to take care about exceptions.
For example:
CComPtr<IXMLDOMElement> create_element(CComPtr<IXMLDOMDocument> xml_doc, string element_name) {
CComPtr<IXMLDOMElement> element;
HRESULT hr = xml_doc->createElement((BSTR)element_name.c_str(), &element);
if (FAILED(hr))
hr_raise("Failed to create XML element '" + element_name + "'", hr);
return element;
}
and use it like this:
void SomeClass::SomeMethod() {
CComPtr<IXMLDOMElement> element = xmlh::create_element(xml_doc, "test");
//..
// save xml to file
}
is it ok? I mean can i return smart pointer as function result? Is this approach free from leaks?
2) Also i use some smartpointer as Class Members.
Like this:
class XMLCommand {
public:
XMLCommand(std::string str_xml);
~XMLCommand(void);
protected:
CComPtr<IXMLDOMDocument> xml_doc;
}
XMLCommand::XMLCommand(string str_xml) {
xml_doc = xmlh::create_xml_doc();
}
// some methods below uses xml_doc
The question is the same, is it correct and free from leaks?
Thanks.
That will work fine. When returning a smart pointer from function, the result is stored before the temporaries are destructed, so as long as you store it in a CComPtr<IXMLDOMElement> when you call create_element you will get the desired results (e.g., CComPtr<IXMLDOMElement> resElem = create_element(...);. Optimized C++ will very likely even not bother with temporaries and such and just use resElem instead of element inside your create_element() method, speeding up the process (google Return Value Optimization for details).
The latter case is pretty much textbook smart-pointer usage. I can't think of a case that will fail. One danger when using smart pointers in general though is to be aware of and/or avoid circular dependencies, which can cause smart pointers to never delete their contained object.

Conditional boost::shared_ptr initialization?

In an effort to write better code, I've committed to using boost's smart pointers.
Should a boost::shared_ptr always be initialized like so?
boost::shared_ptr<TypeX> px(new TypeX);
My confusion has risen from a code block similar to this:
void MyClass::FillVector()
{
boost::shared_ptr<TypeX> tempX;
if(//Condition a)
{
boost::shared_ptr<TypeX> intermediateX(new TypeA);
tempX = intermediateX;
}
else
{
boost::shared_ptr<TypeX> intermediateX(new TypeB);
tempX = intermediateX;
}
_typeXVec.push_back(tempX); // member std::vector< boost::shared_ptr<TypeX> >
}
Is there an accepted way to skip that intermediate shared_ptr while keeping it in scope to push back to _typeXVec?
Thank you.
Edit: Wanted to clarify that TypeA and TypeB are both children of TypeX.
boost::shared_ptr<X> p;
if( condition ) {
p.reset( new A() );
}else {
p.reset( new B() );
}
Should a boost::shared_ptr always be initialized like so?
Yes, per the Boost shared_ptr Best Practices.
This is not the only safe way to have a smart pointer take ownership of a dynamically allocated object, but it is a common pattern that can be used practically everywhere. It's easy for anyone who is familiar with this pattern to look at the code and know that it is correct.
(So, for example, I'm pretty sure that tempX.reset(new TypeA); would also be safe, but I'm not 100% sure. I'd have to check the documentation, and I'd probably want to check the implementation. Then I'd have to think through all of the possible failure cases and see that all of them are handled correctly. If you follow the pattern, you and the people who maintain the code later don't have to waste time thinking through these issues.)
One note, however: rather than the assignment, however, I would recommend either of the following:
using std::swap;
swap(tempX, intermediateX);
tempX = std::move(intermediateX);
These remove the need for the unnecessary atomic reference count update required by assignment.

Lazy object creation in C++, or how to do zero-cost validation

I've stumbled across this great post about validating parameters in C#, and now I wonder how to implement something similar in C++. The main thing I like about this stuff is that is does not cost anything until the first validation fails, as the Begin() function returns null, and the other functions check for this.
Obviously, I can achieve something similar in C++ using Validate* v = 0; IsNotNull(v, ...).IsInRange(v, ...) and have each of them pass on the v pointer, plus return a proxy object for which I duplicate all functions.
Now I wonder whether there is a similar way to achieve this without temporary objects, until the first validation fails. Though I'd guess that allocating something like a std::vector on the stack should be for free (is this actually true? I'd suspect an empty vector does no allocations on the heap, right?)
Other than the fact that C++ does not have extension methods (which prevents being able to add in new validations as easily) it should be too hard.
class Validation
{
vector<string> *errors;
void AddError(const string &error)
{
if (errors == NULL) errors = new vector<string>();
errors->push_back(error);
}
public:
Validation() : errors(NULL) {}
~Validation() { delete errors; }
const Validation &operator=(const Validation &rhs)
{
if (errors == NULL && rhs.errors == NULL) return *this;
if (rhs.errors == NULL)
{
delete errors;
errors = NULL;
return *this;
}
vector<string> *temp = new vector<string>(*rhs.errors);
std::swap(temp, errors);
}
void Check()
{
if (errors)
throw exception();
}
template <typename T>
Validation &IsNotNull(T *value)
{
if (value == NULL) AddError("Cannot be null!");
return *this;
}
template <typename T, typename S>
Validation &IsLessThan(T valueToCheck, S maxValue)
{
if (valueToCheck < maxValue) AddError("Value is too big!");
return *this;
}
// etc..
};
class Validate
{
public:
static Validation Begin() { return Validation(); }
};
Use..
Validate::Begin().IsNotNull(somePointer).IsLessThan(4, 30).Check();
Can't say much to the rest of the question, but I did want to point out this:
Though I'd guess that allocating
something like a std::vector on the
stack should be for free (is this
actually true? I'd suspect an empty
vector does no allocations on the
heap, right?)
No. You still have to allocate any other variables in the vector (such as storage for length) and I believe that it's up to the implementation if they pre-allocate any room for vector elements upon construction. Either way, you are allocating SOMETHING, and while it may not be much allocation is never "free", regardless of taking place on the stack or heap.
That being said, I would imagine that the time taken to do such things will be so minimal that it will only really matter if you are doing it many many times over in quick succession.
I recommend to get a look into Boost.Exception, which provides basically the same functionality (adding arbitrary detailed exception-information to a single exception-object).
Of course you'll need to write some utility methods so you can get the interface you want. But beware: Dereferencing a null-pointer in C++ results in undefined behavior, and null-references must not even exist. So you cannot return a null-pointer in a way as your linked example uses null-references in C# extension methods.
For the zero-cost thing: A simple stack-allocation is quite cheap, and a boost::exception object does not do any heap-allocation itself, but only if you attach any error_info<> objects to it. So it is not exactly zero cost, but nearly as cheap as it can get (one vtable-ptr for the exception-object, plus sizeof(intrusive_ptr<>)).
Therefore this should be the last part where one tries to optimize further...
Re the linked article: Apparently, the overhaead of creating objects in C# is so great that function calls are free in comparison.
I'd personally propose a syntax like
Validate().ISNOTNULL(src).ISNOTNULL(dst);
Validate() contructs a temporary object which is basically just a std::list of problems. Empty lists are quite cheap (no nodes, size=0). ~Validate will throw if the list is not empty. If profiling shows even this is too expensive, then you just change the std::list to a hand-rolled list. Remember, a pointer is an object too. You're not saving an object just by sticking to the unfortunate syntax of a raw pointer. Conversely, the overhead of wrapping a raw pointer with a nice syntax is purely a compile-time price.
PS. ISNOTNULL(x) would be a #define for IsNotNull(x,#x) - similar to how assert() prints out the failed condition, without having to repeat it.