Below I have a bit of code for implementing Huffman compression.
What I was curious about was if I could initialize the left and right pointers without including cstdlib, or rather, if I could initialize an empty memory location for storing the left and right without using malloc.
Also, in my combine function, I do not want to use "NULL" for the string parent node of my left and right, but would rather have an empty string. Will I have to make a new constructor for this? I get an error (basic_string::_M_construct null not valid) when I replace "NULL" with nullptr.
#include <string>
#include <cstdlib>
#ifndef PRIORITY_NODE
#define PRIORITY_NODE
namespace Huffman
{
class PriorityNode
{
private:
std::string key; // The character sequence to compress
long frequency = 0; // The frequency of the character sequence
PriorityNode* left = (PriorityNode*)malloc(sizeof(PriorityNode));
PriorityNode* right = (PriorityNode*)malloc(sizeof(PriorityNode));
public:
PriorityNode(std::string k, long f): frequency(f), key(k){};
std::string getKey() const{ return key;}
long getFrequency() const{ return frequency;}
void setLeft(const PriorityNode& left){*this->left = left;}
void setRight(const PriorityNode& right){*this->right = right;}
PriorityNode& getLeft() const{ return *left;}
PriorityNode& getRight() const{ return *right;}
friend PriorityNode combine(const PriorityNode& lhs, const PriorityNode& rhs)
{
long ret_freq = lhs.getFrequency() + rhs.getFrequency();
PriorityNode ret = PriorityNode("NULL", ret_freq);
ret.setLeft(lhs);
ret.setRight(rhs);
return ret;
};
};
}
#endif
So a couple of points.
key is a string, not a pointer. It makes no sense to set it to nullptr; I am guessing "NULL" is just a stand-in for when the key has no value. Instead, just use an empty string "".
You should try to avoid this manual memory management for several reasons. First, you have no destructor, so the memory you malloc is never freed, which means you have a memory leak. Second, you allocate memory for the sub-nodes even when you do not need them. Third, malloc only reserves raw bytes and never runs the PriorityNode constructor, so assigning through *left is undefined behavior. I would suggest something more like the following:
class PriorityNode {
private:
...
std::shared_ptr<PriorityNode> left, right; // Default constructs to nullptr
...
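friend PriorityNode combine(const std::shared_ptr<PriorityNode>& lhs,
const std::shared_ptr<PriorityNode>& rhs) {
long ret_freq = lhs->getFrequency() + rhs->getFrequency();
PriorityNode ret = PriorityNode("", ret_freq); // empty key for the combined node
ret.setLeft(lhs); // setLeft/setRight must now take and store the shared_ptr
ret.setRight(rhs);
return ret;
}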
The empty string is "". You can pass that to the existing constructor. Unlike a C string (char*), std::string has no separate null state: it is either empty or non-empty. Initializing a std::string from a null pointer is not allowed, which is what the basic_string::_M_construct error is telling you. If you need a separate null state, use std::optional<std::string>.
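Putting the two answers together, here is a minimal sketch of how the node could look with std::shared_ptr children and an empty key for combined nodes. It is not the asker's final design: isLeaf is a hypothetical helper, and having combine return a shared_ptr instead of a PriorityNode by value is an assumption made to keep the example short.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

namespace Huffman {

class PriorityNode {
    std::string key;                            // empty for combined (internal) nodes
    long frequency = 0;
    std::shared_ptr<PriorityNode> left, right;  // default to nullptr, no malloc needed

public:
    PriorityNode(std::string k, long f) : key(std::move(k)), frequency(f) {}

    const std::string& getKey() const { return key; }
    long getFrequency() const { return frequency; }
    const std::shared_ptr<PriorityNode>& getLeft() const { return left; }
    const std::shared_ptr<PriorityNode>& getRight() const { return right; }

    // Hypothetical helper: a node without children is a leaf.
    bool isLeaf() const { return !left && !right; }

    // Builds a parent whose key is the empty string and whose frequency is
    // the sum of both children. The children are shared, not copied.
    friend std::shared_ptr<PriorityNode> combine(std::shared_ptr<PriorityNode> lhs,
                                                 std::shared_ptr<PriorityNode> rhs) {
        assert(lhs && rhs);
        auto parent = std::make_shared<PriorityNode>("", lhs->frequency + rhs->frequency);
        parent->left = std::move(lhs);
        parent->right = std::move(rhs);
        return parent;
    }
};

}  // namespace Huffman
```

Because the children are reference-counted, no destructor has to be written, and a leaf node costs nothing beyond two null pointers.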
Couldn't find the answer in any similar-named question.
I want a user to be able to initialize a string member at any point in the lifetime of an object, not necessarily on construction, but I want them to know that the object is invalid until the string is initialized...
When creating a simple class, say the following:
#include <string>
class my_class {
public:
my_class() : _my_str() { }
my_class(const std::string & str) : my_class() {
set_my_str(str);
}
std::string get_my_str() const {
return _my_str;
}
void set_my_str(const std::string & str) {
_my_str = str;
}
private:
std::string _my_str;
};
and a user creates an empty instance of the class (i.e. using the empty constructor), _my_str will be an empty/uninitialized string?
So, I see two ways of handling behavior: the way mentioned above, where an empty string is returned, or a possible second way:
#include <string>
class my_class {
public:
my_class() : _my_str(), _my_str_ptr(nullptr) { }
my_class(const std::string & str) : my_class() {
set_my_str(str);
}
std::string * get_my_str() const {
return _my_str_ptr;
}
void set_my_str(const std::string & str) {
_my_str = str;
_my_str_ptr = &_my_str;
}
private:
std::string _my_str;
std::string * _my_str_ptr;
};
Where you return a nullptr, and you maintain a pointer to a local variable?
Is that valid behavior? Which way is preferred and why? Wouldn't the second way be better since you are telling the user, "listen, this object is currently invalid, you need to initialize it" while still implying that you are managing the lifetime of such object.
_my_str will be an empty/uninitialized string?
Empty, yes. Uninitialized, no. It's completely initialized (to an empty string).
Where you return a nullptr, and you maintain a pointer to a local variable?
Is that valid behavior?
Yes it's valid, but
Which way is preferred and why? Wouldn't the second way be better since you are telling the user, "listen, this object is currently invalid, you need to initialize it" while still implying that you are managing the lifetime of such object.
It makes absolutely no sense to maintain two distinct member variables for this. It sounds like what you need is std::optional (or the equivalent in Boost, boost::optional), so that _my_str has two states: empty/invalid (contains no string) and non-empty/valid (contains a string):
#include <string>
#include <optional> // C++17; older toolchains shipped <experimental/optional>
using std::optional;
class my_class {
public:
my_class() /* default-initializes _my_str as empty */ { }
my_class(const std::string & str) : _my_str(str) { }
const std::string * get_my_str() const {
if (_my_str) // if it exists
return &*_my_str; // return the string inside the optional
else
return nullptr; // if the optional is empty, return null
}
/* Or simply this, if you don't mind exposing a bit of the
implementation details of the class:
const optional<std::string> & get_my_str() const {
return _my_str;
}
*/
void set_my_str(const std::string & str) {
_my_str = str;
}
private:
optional<std::string> _my_str;
};
If "" (an empty string) can be used as a sentinel value to signify the "empty/invalid" state in your case, then you can just do this:
#include <string>
class my_class {
public:
my_class() /* default-initializes _my_str as "" */ { }
my_class(const std::string & str) : _my_str(str) { }
const std::string * get_my_str() const {
if (!_my_str.empty()) // if it's non-empty
return &_my_str; // return the non-empty string
else
return nullptr; // if it's empty, return null
}
void set_my_str(const std::string & str) {
_my_str = str;
}
private:
std::string _my_str;
};
In general, the pattern you're referring to is called Null object pattern.
The "oldest way" of implementing it was to reserve one of the variable's possible values to mean "no value". For a string, the empty string was commonly used this way. Obviously that is not always possible, for example when every value is needed.
The "old way" was to always use a pointer (const T* get_t() const). This way the whole range of values could be meaningful, and "no value" semantics were still available by returning a null pointer. This was better, but pointers are less comfortable to use and less safe. Nowadays, raw pointers in an interface like this are usually considered poor engineering.
The modern way is optional<T> (or boost::optional<T>).
An empty std::string value is not per definition invalid. It is just empty.
One important difference is that the second "get_..." approach does not copy the object but gives the user a non-const pointer to the internal string. This violates const correctness: the const on the get method implies that the object cannot be changed through it, yet the returned pointer can be used to change the internal state.
If your logic implies that "empty string" == "invalid" and if this is a possible state there is not much of a difference whether the user must do
if (get_my_str()) // use valid pointer to nonempty string versus
if(!get_my_str().empty()) // use valid nonempty string
I think.
You'd want to return std::string const & from your get method and leave it to the user whether to copy the object or not.
4.1. No forced copy (versus by value return std::string)
4.2. No pointer which may be nullptr and accidentally dereferenced.
4.3. Passing around and storing a pointer which may outlive the object is more common than dangling references.
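The three points above can be sketched as follows. This assumes the "empty string means invalid" convention from the earlier answer; is_valid is a hypothetical helper that is not part of the original class.

```cpp
#include <string>

class my_class {
public:
    my_class() = default;  // _my_str starts out as the empty string
    explicit my_class(const std::string& str) : _my_str(str) {}

    // No forced copy and no pointer to check: the caller decides whether
    // to copy the returned string or just read it.
    const std::string& get_my_str() const { return _my_str; }

    // Hypothetical helper: the object counts as valid once a non-empty
    // string has been set.
    bool is_valid() const { return !_my_str.empty(); }

    void set_my_str(const std::string& str) { _my_str = str; }

private:
    std::string _my_str;
};
```

A caller would check is_valid() (or get_my_str().empty()) before using the value, and the returned reference cannot be used to modify the object.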
I want a user to be able to initialize the string later on, not necessarily on construction, but I want them to be able to know that the object is invalid until the string is initialized...
The question is: Is an empty string actually a "valid" value after proper initialization?
If yes: use optional to add one additional state signaling validity.
If no: let the emptiness of the string stand for invalidity of your object.
I'm looking into optimizing some std::map code. The map contains objects that are accessed via a string identifier.
Example:
std::map<std::string, CVeryImportantObject> theMap;
...
theMap["second"] = new CVeryImportantObject();
Now, when using the find-function as theMap->find("second"), the String is converted into std::string("second"), which causes new string allocations (over all when using IDL=2 with Visual Studio).
1. Is there a possibility to use a string-only class to avoid such allocations?
Intentionally I've tried to use another String-Class as well:
std::map<CString, CVeryImportantObject> theMap;
This code works also. But CString indeed is an object.
And: If you remove an object from the map, I'll need to release both the related object and the key, do I?
Any suggestions?
Now, when using the find-function as theMap->find("second"), the
String is converted into std::string("second"), which causes new
string allocations (over all when using IDL=2 with Visual Studio).
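This is an issue in the Standard that C++14 fixes for ordered containers: with a transparent comparator such as std::less<>, find can compare the argument against the keys directly instead of first converting it to the key type. The newest version of VS, VS 14 CTP (which is a pre-release), contains a fix for this issue, as will new versions of other implementations.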
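As a rough illustration of the C++14 behavior, the sketch below declares the map with std::less<> as its comparator. ObjectMap and contains_key are hypothetical names, and int stands in for CVeryImportantObject.

```cpp
#include <functional>
#include <map>
#include <string>

// std::less<> (C++14) is a transparent comparator. With it, find() accepts
// any type that can be compared with std::string, so a string literal is
// compared directly and no temporary std::string is constructed.
using ObjectMap = std::map<std::string, int, std::less<>>;

bool contains_key(const ObjectMap& m, const char* key) {
    return m.find(key) != m.end();
}
```

Only the map's declaration changes; existing calls such as theMap.find("second") keep compiling and simply stop creating the temporary.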
If you need to avoid allocations, you can try a class like llvm::StringRef which can refer to std::string or string literals interchangeably, but then you will be left trying to handle the ownership externally.
You can try something like unique_ptr<char[], maybe_delete> that sometimes deletes the contents. This is a bit of a mess to interface with though.
And: If you remove an object from the map, I'll need to release both
the related object and the key, do I?
The map will automatically destruct the key and value for you. For a class which frees its own resources like std::string, which is the only sane way to write C++, you can erase without worrying about resource cleanup.
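To see that erase really runs the destructors, here is a small sketch. Tracked is a hypothetical stand-in for CVeryImportantObject that counts how many instances are currently alive.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for CVeryImportantObject that counts live instances.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    Tracked(const Tracked&) { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Inserts one element, erases it again, and reports how many Tracked
// objects are still alive afterwards.
int alive_after_erase() {
    std::map<std::string, Tracked> theMap;
    theMap["second"];        // default-constructs the value inside the map
    theMap.erase("second");  // destroys both the std::string key and the value
    return Tracked::alive;
}
```

If the map stored raw pointers obtained from new instead of values, erase would only destroy the pointer and the pointed-to object would have to be deleted by hand.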
If you always use string constants as keys, you can use const char * as key type in map when you use proper comparator:
#include <cstring> // std::strcmp
#include <map>
struct PCharCompare {
bool operator()( const char *s1, const char *s2 ) const { return std::strcmp( s1, s2 ) < 0; }
};
std::map< const char *, CVeryImportantObject, PCharCompare> theMap;
Note: you have to be careful and need to understand how it works, as it can easily lead to UB:
void foo() {
char buffer[256];
snprintf( buffer, sizeof( buffer ), "blah" );
theMap.insert( std::make_pair( buffer, Object ) );
} // oops: buffer is gone here, so the map now holds a dangling pointer
As for optimization, it is very unlikely that std::string creation is the culprit. You may try std::unordered_map or something similar for optimization.
Now, when using the find-function as theMap->find("second"), the
String is converted into std::string("second"), which causes new
string allocations
Not necessarily. VC uses Small-String Optimisation (SSO). This means that for a string as short as "second", no allocation on the heap should take place at all; the characters will instead be stored directly in the temporarily created std::string object.
This is still not free (because the std::string has to be created, albeit without any dynamic allocation happening inside), but should be good enough. Is it really a concern for you? Chances are very high that it does not cause any measurable performance decrease.
Is there a possibility to use a string-only class to avoid such allocations?
Not really, except for the C++14 fix mentioned in other answers. Using char const * as the key type is very dangerous, because std::map will only store the actual addresses, not copies of the keys.
If I were you and if I really experienced performance problems, I'd just not use std::map directly but create my own container class to wrap a std::map<char const *, T, CustomComparison> and do the hard pointer work inside.
#include <cstring> // strcmp, strlen, strcpy
#include <map>
#include <utility> // std::make_pair
template <class ValueType>
class FastStringMap
{
private:
struct Comparison
{
bool operator()(char const *lhs, char const *rhs) const
{
return strcmp(lhs, rhs) < 0; // ascending order, like std::less on std::string
}
};
typedef std::map<char const *, ValueType, Comparison> WrappedMap;
WrappedMap m_map;
public:
typedef typename WrappedMap::iterator iterator;
typedef typename WrappedMap::const_iterator const_iterator;
bool insert(char const *key, ValueType const &value)
{
if (m_map.find(key) != m_map.end())
{
return false;
}
else
{
char *copy = new char[strlen(key) + 1];
strcpy(copy, key);
try
{
return m_map.insert(std::make_pair(copy, value)).second;
}
catch (...)
{
delete[] copy; // allocated with new[], so it must be released with delete[]
throw;
}
}
}
~FastStringMap()
{
for (iterator iter = m_map.begin(); iter != m_map.end(); ++iter)
{
delete[] iter->first;
}
}
iterator find(char const *key)
{
return m_map.find(key);
}
const_iterator find(char const *key) const
{
return m_map.find(key);
}
// further operations
};
To be used like this:
FastStringMap<int> m;
m.insert("AAA", 1);
m.insert("BBB", 2);
m.insert("CCC", 3);
std::cout << m.find("AAA")->second;
Note that you can possibly make this more sophisticated by templatising also on the character type (for std::wstring support) or by providing "real" iterator classes (using Boost Iterator Facade).
And: If you remove an object from the map, I'll need to release both
the related object and the key, do I?
If you use std::string, no. If you use char const * and if the pointers point to memory allocated dynamically (as in my example), then yes.
Hi, I have a test tomorrow and can't figure out why a subtraction is made on the pointer before checking whether the refcount is 0. I've been searching on Google but still can't figure it out, so I'm hoping that turning to you guys :) will help.
The easiest thing is to just show you the code. I've marked the lines in question with comments, so here it is:
This is the struct StringRep, which counts how many String objects point to it:
struct StringRep{
int size; // number of chars incl. the terminating \0 character
char* chars; // Pointer to char
int refCount; // Amount of String-variables
};
And this is class String that uses the StringRep,
class String{
public:
String(char* str);
String(const String& other);
~String();
const String& operator=(const String& rhs);
char get(int index) const { return srep->chars[index]; }
void put(char ch, int index);
private:
StringRep* srep;
};
String::String(const String& other):srep(other.srep){
srep->refCount++;
}
String::~String(){
if (--srep->refCount == 0){ //why --srep here?
delete [] srep->chars;
delete srep;
}
}
const String& String::operator=(const String& rhs){
if (srep != rhs.srep){
if (--srep->refCount == 0){ //why --srep here?
delete [] srep->chars;
delete srep;
}
srep = rhs.srep;
srep->refCount++;
}
return *this;
}
void String::put(char ch, int index){
if (srep->refCount > 1){ //Why not --srep here?
StringRep* tmpRep = new StringRep;
tmpRep->refCount = 1;
tmpRep->size = srep->size;
tmpRep->chars = new char[tmpRep->size];
std::strcpy(tmpRep->chars, srep->chars);
--srep->refCount;
srep = tmpRep;
}
srep->chars[index] = ch;
}
This is all the info I have on the example question for the test. I know that --srep would point to the object before srep, but I can't figure out the logic behind checking whether what is pointed at is now 0 and then deciding it is okay to delete or to copy. Why? As I said, I've searched the web and found some answers that helped me understand how pointers and the subtraction work; it is more the logic that is confusing.
Best regards
Because of operator precedence, --srep->refCount is not decrementing srep, but the refCount member.
So, the code is decrementing the refCount, and if it comes down to 0, it can assume that the last reference to the object is being destroyed.
--srep->refCount
is parsed as
--(srep->refCount)
because prefix decrement has lower priority than -> (however, postfix decrement has the same priority as ->). Always use parens in your own code!
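A tiny sketch makes the parse visible. drop_reference is a hypothetical helper that mirrors the check in the destructor; StringRep is the struct from the question.

```cpp
struct StringRep {
    int size;      // number of chars including the terminating '\0'
    char* chars;   // the character data
    int refCount;  // number of String objects sharing this representation
};

// Returns true when the caller held the last reference.
bool drop_reference(StringRep* srep) {
    // Parsed as --(srep->refCount): the counter is decremented first and
    // its new value is compared with 0. The pointer srep is not modified.
    return --srep->refCount == 0;
}
```

In put() the counter is only decremented after the private copy has been made, because the old representation is still in use by the other String objects and must not be deleted there.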
I'm trying to re-learn C++ and was wondering if anyone could help me out here. I'm trying to implement my own String class to see if I can remember things, but I'm stuck on the constructor.
I have my header file and want to have a constructor as so:
Header File (MyFiles\String.h):
#ifndef STRING_
#define STRING_
using namespace std;
#include <iostream>
class String
{
private:
static const unsigned int MAX = 32; // Capacity of string
char Mem[MAX]; // Memory to hold characters in string
unsigned Len; // Number of characters in string
public:
// Construct empty string
//
String()
{
Len = 0;
}
// Reset string to empty
//
void reset()
{
Len = 0;
}
// Return status information
//
bool empty() const
{
return Len == 0;
}
unsigned length() const
{
return Len;
}
// Return reference to element I
//
char& operator[]( unsigned I )
{
return Mem[I];
}
// Return constant reference to element I
//
const char& operator[]( unsigned I ) const
{
return Mem[I];
}
// Construct string by copying existing string
//
String( const String& );
// Construct string by copying array of characters
//
String( const char [] );
// Copy string to the current string
//
String& operator=( const String& );
// Append string to the current string
//
String& operator+=( const String& );
};
// Compare two strings
//
bool operator==( const String&, const String& );
bool operator!=( const String&, const String& );
// Put a string into an output stream
//
ostream& operator<<( ostream&, const String& );
#endif
The bit I'm stuck on is this:
String::String(const String& str)
{
//what goes here?
}
Thanks!
Well, since it's a learning exercise:
I think you want to copy the contents of the other string here, since this is a copy constructor, so you will want to copy across all the member variables. In your case a hand-written copy constructor is not strictly necessary, because Mem is a fixed-size array member and the compiler-generated copy already copies it. If you had dynamic memory (i.e. Mem were a pointer allocated with new), you would need one. However, to show you how it's done, here you go.
String::String(const String& str)
{
//what goes here?
assert(str.Len <= MAX); // needs <cassert>; a full string has Len == MAX
memcpy(Mem, str.Mem, str.Len);
Len = str.Len;
}
You need to copy the data from str to this. The length is easy:
Len = str.Len; // or, equiv. this->Len= str.Len
The data is a little harder. You might use memcpy or a for loop; strcpy is only safe if Mem is guaranteed to hold a terminating '\0', which this class does not ensure.
memcpy(Mem, str.Mem, sizeof Mem);
Good luck!
I concur with Kornel Kisielewicz: the fewer hand-rolled String classes, the better. But you're only doing this to learn, so fair enough :-). Anyway: your copy constructor needs to copy over the length and the contents of the Mem array, and that's it.
(If you were doing this to make something useful rather than as a learning exercise, I'd add: a string class with a fixed maximum string length -- especially one as small as 32 characters -- is a very bad idea indeed. But it's entirely reasonable if you don't feel like dealing with memory allocation and deallocation at the same time as you're trying to remember the even-more-basics...)
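For reference, here is a trimmed-down sketch of such a class with the copy constructor filled in. FixedString is a hypothetical name used to avoid clashing with the header above, and silently truncating over-long input in the char-array constructor is an assumption; the question does not say what should happen in that case.

```cpp
#include <cstring>

class FixedString {
    static const unsigned MAX = 32;  // capacity of the string
    char Mem[MAX];                   // character storage, not null-terminated
    unsigned Len;                    // number of characters in use

public:
    FixedString() : Len(0) {}

    // Copies characters up to the terminating '\0'. Input longer than MAX
    // is truncated (an assumption made for this sketch).
    FixedString(const char s[]) : Len(0) {
        while (Len < MAX && s[Len] != '\0') {
            Mem[Len] = s[Len];
            ++Len;
        }
    }

    // Copy constructor: copy the length and the characters that are in use.
    FixedString(const FixedString& str) : Len(str.Len) {
        std::memcpy(Mem, str.Mem, str.Len);
    }

    unsigned length() const { return Len; }
    char operator[](unsigned i) const { return Mem[i]; }
};
```

Only the first Len characters are copied, so the unused tail of the source array is never read.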
This is a homework assignment. The Field container was the assignment from a week ago, and now I'm supposed to use the Field container to act as a dynamic array for a struct NumPair which holds two char * like so:
struct NumPair
{
char *pFirst, *pSecond;
int count;
NumPair( char *pfirst = "", char *psecond = "", int count = 0)
: pFirst(strdup(pfirst)), pSecond(strdup(psecond)), count(count)
{ }
NumPair( const NumPair& np )
: count(np.count), pFirst(strdup(np.pFirst)), pSecond(strdup(np.pSecond))
{ }
NumPair& operator=( const NumPair& np )
{
if(this != &np)
{
pFirst = strdup(np.pFirst);
pSecond = strdup(np.pSecond);
count = np.count;
}
return *this;
}
and the Field container
Field<NumPair> dict_;
The homework requires the use of char *, and not string, so that we can get better with all this low-level stuff. I've already had some question about char to wchar_t conversions, etc.
Now I have a question as to whether or not I'm destructing the NumPair properly. The scenario is as follows:
1) Field destructor gets called
template <class T>
Field<T>::~Field()
{
delete[] v_;
}
2) Delete calls the destructor of every element NumPair in v_;
~NumPair()
{
free(pFirst);
free(pSecond);
}
Is this okay? I haven't really read too many articles about mixing and matching elements created on the heap and free-store as we wish. I figure as long as I don't use delete on an improper malloc'ed element, I should be fine.
However, I don't know the entire intricacies of the delete command, so I'm wondering whether or not this is valid design, and what I could do to make it better.
Also, of course this isn't. I'm getting an error of the type:
This may be due to a corruption of the heap and points to dbgheap
extern "C" _CRTIMP int __cdecl _CrtIsValidHeapPointer(
const void * pUserData
)
{
if (!pUserData)
return FALSE;
if (!_CrtIsValidPointer(pHdr(pUserData), sizeof(_CrtMemBlockHeader), FALSE))
return FALSE;
return HeapValidate( _crtheap, 0, pHdr(pUserData) ); // Here
}
Again, how could I improve this without the use of string?
FIELD CTOR/Copy Ctor/Assignment
template <class T>
Field<T>::Field()
: v_(0), vused_(0), vsize_(0)
{ }
template <class T>
Field<T>::Field(size_t n, const T &val)
: v_(0), vused_(n), vsize_(0)
{
if(n > 0)
{
vsize_ = 1;
while(vsize_ < n)
vsize_ <<= 1;
v_ = new T[vsize_];
std::fill(v_, (v_ + vused_), val);
}
}
template <class T>
Field<T>::Field(const Field<T> &other)
: v_(new T[other.vsize_]), vsize_(other.vsize_), vused_(other.vused_)
{
std::copy(other.v_, (other.v_ + other.vused_), v_);
}
template <class T>
Field<T>& Field<T>::operator =(const Field<T> &other)
{
this->v_ = other.v_;
this->vused_ = other.vused_;
this->vsize_ = other.vsize_;
return *this;
}
FIELD MEMBERS
T *v_;
size_t vsize_;
size_t vused_;
Your copy constructor (of Field<>) seems OK, but the operator= is problematic.
Not only does it leak memory (what happens to the original v_?), but after that, two instances of Field<> hold a pointer to the same block of memory, and the one that is destructed first will invalidate the other's v_, and you can't even tell whether that has happened.
It's not always easy to decide how to deal with operator= - some think that implicit move semantics are okay, but the rest of us see how that played out with the majority of people, with std::auto_ptr. Probably the easiest solution is to disable copying altogether, and use explicit functions for moving ownership.
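If copying should stay enabled, another common option is the copy-and-swap idiom. The sketch below is a cut-down Field, not the asker's full container: the swap, size and operator[] members are assumptions added for illustration, and exception safety inside the copy constructor is not addressed.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

template <class T>
class Field {
    T* v_;
    std::size_t vsize_;  // allocated capacity
    std::size_t vused_;  // number of elements in use

public:
    Field() : v_(nullptr), vsize_(0), vused_(0) {}

    Field(std::size_t n, const T& val) : v_(nullptr), vsize_(0), vused_(n) {
        if (n > 0) {
            vsize_ = 1;
            while (vsize_ < n) vsize_ <<= 1;  // round capacity up to a power of two
            v_ = new T[vsize_];
            std::fill(v_, v_ + vused_, val);
        }
    }

    // Deep copy: the new Field owns its own array.
    Field(const Field& other)
        : v_(other.vsize_ ? new T[other.vsize_] : nullptr),
          vsize_(other.vsize_), vused_(other.vused_) {
        std::copy(other.v_, other.v_ + other.vused_, v_);
    }

    // Copy-and-swap: the parameter is already a deep copy, so swapping with
    // it hands the old array to 'other', whose destructor releases it.
    Field& operator=(Field other) {
        swap(other);
        return *this;
    }

    ~Field() { delete[] v_; }

    void swap(Field& other) {
        std::swap(v_, other.v_);
        std::swap(vsize_, other.vsize_);
        std::swap(vused_, other.vused_);
    }

    std::size_t size() const { return vused_; }
    T& operator[](std::size_t i) { return v_[i]; }
    const T& operator[](std::size_t i) const { return v_[i]; }
};
```

After b = a, both objects own separate arrays, so nothing leaks and destroying one Field cannot invalidate the other.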
Your string handling in NumPair looks ok (strdup + free) and your Field container delete[] looks okay but it's hard to say because you don't show what v_ is.
eq mentions in a comment that you should also beware of how you are copying NumPairs. By default, C++ will give you an implicit member-wise copy constructor. This is where a RAII type like std::string makes your life easier: a struct containing std::string members can be copied without any special handling on your part, and the memory referenced in the string will be taken care of by the string's copy. If you duplicate a NumPair that relies on the implicit copy (by assigning it or returning it from a function, for example), then the destruction of the temporary will free your strings out from under you.
Your assignment operator for Field just copies the pointer v_. If you have two Fields sharing it, all of the NumPairs in v_ will be deleted when the first Field goes out of scope, and then deleted again when the second one does.
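The same ownership rules apply to NumPair itself: every copy must own its own strings, and assignment must release the strings it replaces. A minimal sketch, still using char* as the homework requires, could look like this. dup_cstr is a hypothetical portable stand-in for the POSIX strdup used in the question, and the swap-based assignment is one possible choice, not the only one.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>
#include <utility>

// Hypothetical stand-in for strdup: returns a malloc'ed copy of s.
inline char* dup_cstr(const char* s) {
    std::size_t n = std::strlen(s) + 1;
    char* p = static_cast<char*>(std::malloc(n));
    if (!p) throw std::bad_alloc();
    std::memcpy(p, s, n);
    return p;
}

struct NumPair {
    char* pFirst;
    char* pSecond;
    int count;

    NumPair(const char* first = "", const char* second = "", int n = 0)
        : pFirst(dup_cstr(first)), pSecond(nullptr), count(n) {
        try {
            pSecond = dup_cstr(second);
        } catch (...) {
            std::free(pFirst);  // do not leak the first copy
            throw;
        }
    }

    // Deep copy: delegate to the constructor that duplicates both strings.
    NumPair(const NumPair& np) : NumPair(np.pFirst, np.pSecond, np.count) {}

    // Copy-and-swap: np is a deep copy; after the swap it holds the old
    // strings, which its destructor frees.
    NumPair& operator=(NumPair np) {
        std::swap(pFirst, np.pFirst);
        std::swap(pSecond, np.pSecond);
        std::swap(count, np.count);
        return *this;
    }

    ~NumPair() {
        std::free(pFirst);
        std::free(pSecond);
    }
};
```

With this in place, std::fill and std::copy inside Field can assign NumPair elements freely without leaking the old strings or freeing a buffer twice.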