C++ strings without <string> and STL

C++ strings without <string> and STL - c++

I've not used C++ very much in the past, and have recently been doing a lot of C#, and I'm really struggling to get back into the basics of C++ again. This is particularly tricky as work mandates that none of the most handy C++ constructs can be used, so all strings must be char *'s, and there is no provision for STL lists.
What I'm currently trying to do is to create a list of strings, something which would take me no time at all using STL or in C#. Basically I want to have a function such as:
char **registeredNames = new char*[numberOfNames];
Then,
RegisterName(const * char const name, const int length)
{
//loop to see if name already registered snipped
if(notFound)
{
registeredNames[lastIndex++] = name;
}
}
or, if it was C#...
if(!registeredNames.Contains(name))
{
registeredNames.Add(name);
}
and I realize that it doesn't work. I know the const nature of the passed variables (a const pointer and a const string) makes it rather difficult, but my basic problem is that I've always avoided this situation in the past by using STL lists etc. so I've never had to work around it!

There are legitimate reasons that STL might be avoided. When working in fixed environments where memory or speed is a premium, it's sometimes difficult to tell what is going on under the hood with STL. Yes, you can write your own memory allocators, and yes, speed generally isn't a problem, but there are differences between STL implementations across platforms, and those differences mighe be subtle and potentially buggy. Memory is perhaps my biggest concern when thinking about using it.
Memory is precious, and how we use it needs to be tightly controlled. Unless you've been down this road, this concept might not make sense, but it's true. We do allow for STL usage in tools (outside of game code), but it's prohibited inside of the actual game. One other related problem is code size. I am slightly unsure of how much STL can contribute to executable size, but we've seen marked increases in code size when using STL. Even if your executable is "only" 2M bigger, that's 2M less RAM for something else for your game.
STL is nice for sure. But it can be abused by programmers who don't know what they are doing. It's not intentional, but it can provide nasty surprises when you don't want to see them (again, memory bloat and performance issues)
I'm sure that you are close with your solution.
for ( i = 0; i < lastIndex; i++ ) {
if ( !strcmp(&registeredNames[i], name ) {
break; // name was found
}
}
if ( i == lastIndex ) {
// name was not found in the registeredNames list
registeredNames[lastIndex++] = strdup(name);
}
You might not want to use strdup. That's simply an example of how to to store the name given your example. You might want to make sure that you either don't want to allocate space for the new name yourself, or use some other memory construct that might already be available in your app.
And please, don't write a string class. I have held up string classes as perhaps the worst example of how not to re-engineer a basic C construct in C++. Yes, the string class can hide lots of nifty details from you, but it's memory usage patterns are terrible, and those don't fit well into a console (i.e. ps3 or 360, etc) environment. About 8 years ago we did the same time. 200000+ memory allocations before we hit the main menu. Memory was terribly fragmented and we couldn't get the rest of the game to fit in the fixed environment. We wound up ripping it out.
Class design is great for some things, but this isn't one of them. This is an opinion, but it's based on real world experience.

You'll probably need to use strcmp to see if the string is already stored:
for (int index=0; index<=lastIndex; index++)
{
if (strcmp(registeredNames[index], name) == 0)
{
return; // Already registered
}
}
Then if you really need to store a copy of the string, then you'll need to allocate a buffer and copy the characters over.
char* nameCopy = malloc(length+1);
strcpy(nameCopy, name);
registeredNames[lastIndex++] = nameCopy;
You didn't mention whether your input is NULL terminated - if not, then extra care is needed, and strcmp/strcpy won't be suitable.

If portability is an issue, you may want to check out STLport.

Why can't you use the STL?
Anyway, I would suggest that you implement a simple string class and list templates of your own. That way you can use the same techniques as you normally would and keep the pointer and memory management confined to those classes. If you mimic the STL, it would be even better.

If you really can't use stl (and I regret believing that was true when I was in the games industry) then can you not create your own string class? The most basic of string class would allocate memory on construction and assignment, and handle the delete in the destructor. Later you could add further functionality as you need it. Totally portable, and very easy to write and unit test.

Working with char* requires you to work with C functions. In your case, what you really need is to copy the strings around. To help you, you have the strndup function. Then you'll have to write something like:
void RegisterName(const char* name)
{
// loop to see if name already registered snipped
if(notFound)
{
registerNames[lastIndex++] = stdndup(name, MAX_STRING_LENGTH);
}
}
This code suppose your array is big enough.
Of course, the very best would be to properly implement your own string and array and list, ... or to convince your boss the STL is not evil anymore !

Edit: I guess I misunderstood your question. There is no constness problem in this code I'm aware of.
I'm doing this from my head but it should be about right:
static int lastIndex = 0;
static char **registeredNames = new char*[numberOfNames];
void RegisterName(const * char const name)
{
bool found = false;
//loop to see if name already registered snipped
for (int i = 0; i < lastIndex; i++)
{
if (strcmp(name, registeredNames[i] == 0))
{
found = true;
break;
}
}
if (!found)
{
registeredNames[lastIndex++] = name;
}
}

I can understand why you can't use STL - most do bloat your code terribly. However there are implementations for games programmers by games programmers - RDESTL is one such library.

Using:
const char **registeredNames = new const char * [numberOfNames];
will allow you to assign a const * char const to an element of the array.
Just out of curiosity, why does "work mandates that none of the most handy C++ constructs can be used"?

All the approaches suggested are valid, my point is if the way C# does it is appealing replicate it, create your own classes/interfaces to present the same abstraction, i.e. a simple linked list class with methods Contains and Add, using the sample code provided by other answers this should be relatively simple.
One of the great things about C++ is generally you can make it look and act the way you want, if another language has a great implementation of something you can usually reproduce it.

I have used this String class for years.
http://www.robertnz.net/string.htm
It provides practically all the features of the
STL string but is implemented as a true class not a template
and does not use STL.

This is a clear case of you get to roll your own. And do the same for a vector class.
Do it with test-first programming.
Keep it simple.
Avoid reference counting the string buffer if you are in MT environment.

If you are not worried about conventions and just want to get the job done use realloc. I do this sort of thing for lists all of the time, it goes something like this:
T** list = 0;
unsigned int length = 0;
T* AddItem(T Item)
{
list = realloc(list, sizeof(T)*(length+1));
if(!list) return 0;
list[length] = new T(Item);
++length;
return list[length];
}
void CleanupList()
{
for(unsigned int i = 0; i < length; ++i)
{
delete item[i];
}
free(list)
}
There is more you can do, e.g. only realloc each time the list size doubles, functions for removing items from list by index or by checking equality, make a template class for handling lists etc... (I have one I wrote ages ago and always use myself... but sadly I am at work and can't just copy-paste it here). To be perfectly honest though, this will probably not outperform the STL equivalent, although it may equal its performance if you do a ton of work or have an especially poor implementation of STL.
Annoyingly C++ is without an operator renew/resize to replace realloc, which would be very useful.
Oh, and apologies if my code is error ridden, I just pulled it out from memory.

const correctness is still const correctness regardless of whether you use the STL or not. I believe what you are looking for is to make registeredNames a const char ** so that the assignment to registeredNames[i] (which is a const char *) works.
Moreover, is this really what you want to be doing? It seems like making a copy of the string is probably more appropriate.
Moreover still, you shouldn't be thinking about storing this in a list given the operation you are doing on it, a set would be better.

Related

Is it possible to free memory allocated to multidimensional array

I basically have a uint64_t[8][3] I however dont need all the positions in this array .I could create another node structure to dynamically only set proper positions but it would be harder because I rely on the indices of matrix for my particular program. How can I free selectively some indices of the matrix.
For example I don't need uint64_t[4][3] or uint64_t[7][3] for a particular node how do i free this?

You can free the elements in certain indices if they are dynamically allocated using delete(e.g. delete variableName[0][2]), but you can't remove the indices themselves.

This is not possible or at least not the way you want to it. If you are trying to implement a binary tree-like structure take a look at binary tree this way you have tree structure where you could allocate only what you need dynamically.
if you still need arrays do not use old c style arrays, the c++ committee has been actively updating the standard since 1998 and the modern way of working with arrays is using std::array. If you are working in an evironment where you are working with old C-APIs, write your logic in a modern c++ static library compile it and link against that library in your C-API.
That being said If you are worrying about performance or memory issues, you should reconsider your way of thinking, alot of young developers get caught up in this patterns of thinking and end up writing code that is neither fast nor elegant in terms of design (Its called pessimizing the code). When you write a program first start with readable and simple design, then do profiling and figure out what is slow and should be optimized.
if you really need some of the values in the array to be nulls so you can do crazy stuff with it (such as null checking, which I would highly discourage) you should wrap you integers in another class :
class Integer
{
public:
explicit Integer(int value)
{
value_ = value;
};
Integer &operator=(Integer other)
{
value_ = other.value_;
};
private:
int value_;
};
class test
{
public:
test()
{
Integer four(4);
std::array<std::unique_ptr<Integer>, 2> arr = { std::make_unique<Integer>(four), nullptr};
};
};
NOTE I: What I wrote above is only a trick, so you know everything is possible in c++, however the complexity of what I just wrote to achieve what you wanted, should convince you that its not a good approach for your simple case.
NOTE II: The code is not tested, dont use this as is in production code.

You can't free it, it's cpp that's how it works...

How to handle memory management when passing pointers to library functions

I'm working on a project in C++/CLI currently, and I'm trying to wrap my brain around memory allocation. In my code I need to call methods from an external library that I don't have the ability to edit or read the source code. A specific issues I've ran into is the following. They have a class something like this:
class Item {
public:
void SetName(const char* name);
};
Nothing too unusual, it has a method for setting the name... but wait a minute... how am I supposed to free that char* pointer? If I call it from my code like this:
void MyMethod(Item* myItem){
char* name = CallOtherMethodThatGivesMeTheName();
myItem->SetName(name);
};
When am I supposed to deallocate name? I can't stay inside this function forever waiting until myItem no longer uses it, and I have absolutely no clue how the library has implemented SetName().
Should I just delete name within this scope? Won't that mean myItem no longer has a valid pointer though? Maybe I should use some sort of smart pointer? In general, is there a standard way to handle this situation?
And since I am using C++ / CLI, my actual method looks like the following, but I'm looking for a general solution (this might be wrong too).
void MyMethod(Item* myItem, String^ name){
pin_ptr<const wchar_t> pinned_name = PtrToStringChars(name)
myItem->SetName(pinned_name);
};
Thanks for any help!

Ignoring the CLI portion of your question, I provide the following answer. I don't think using CLI is relevant to the core of your question. What you really want to know is when the third party library deems it safe for you to release some allocated string you passed it.
Given the code you provided:
class Item {
public:
void SetName(const char* name);
};
and assuming there is no documentation, I would assume the library makes an internal copy of a c-style string that you provide and that there is no reason to concern yourself with freeing it any more than you would normally. I make this assumption from the context. That would be the proper thing to do. Granted not all third party libraries do the "proper thing", but you cannot be omnipotent.
For example, I would assume:
#include <string>
int main()
{
Item item;
{
std::string name("Sam");
item.SetName(name.c_str());
}
// Carry on doing things with item...
return 0;
}
works just fine, until I examine some behavior that says otherwise. If I am concerned about a third party library behaving poorly, then I test the above in a loop and watch the memory, or witness access violations, but not until I have reason to do so.
In the end:
) Read the documentation for the library.
) Ask on their user groups or forums.
) If none is available, use intuition and context.
) If something unexpected happens, then test.

When you change the value of name after you called SetName and the library uses the originally provided value you know it has been copied.
Otherwise it uses always the current pointer and you cannot deallocate it safely due to the fact you don't know in which order the compiler releases the objects when the program ends.

Usually you only "delete" what you create with "new".
Libraries that return data with pointers, work in two way:
1- Ask you to allocate memory and pass them the pointer (datatype * p)
2- Allocate for you and give you the pointer (return datatype *) or (argument datatype ** p) or (argument datatype * & p)
In first way you know when to free your memory.
In second way, when you finished working with data, you need to call another function from the same library to free the memory for you, pair functions like this pattern are quite common (open close) (alloc free) ...
Of course there can be other ways, so always reading documentation is the best.
In C++/CLI you don't delete objects defined with "^" sign, garbage-collector will take care of them.

how do C++ professional programmers implement common abstractions?

I've never programmed with C++ professionally and working with (Visual) C++ as student. I'm having difficulty dealing with the lack of abstractions especially with the STL container classes. For example, the vector class doesn't contain a simple remove method, common in many libraries e.g. .NET Framework. I know there's an erase method, it doesn't make the remove method abstract enough to reduce the operation to a one-line method call. For example, if I have a
std::vector<std::string>
I don't know how else to remove a string element from the vector without iterating thru it and searching for a matching string element.
bool remove(vector<string> & msgs, string toRemove) {
if (msgs.size() > 0) {
vector<string>::iterator it = msgs.end() - 1;
while (it >= msgs.begin()) {
string remove = it->data();
if (remove == toRemove) {
//std::cout << "removing '" << it->data() << "'\n";
msgs.erase(it);
return true;
}
it--;
}
}
return false;
}
What do professional C++ programmers do in this situation? Do you write out the implementation every time? Do you create your own container class, your own library of helper functions, or do you suggest using another library i.e. Boost (even if you program Windows in Visual Studio)? or something else?
(if the above remove operation needs work, please leave an alternative method of doing this, thanks.)

You would use the "remove and erase idiom":
v.erase(std::remove(v.begin(), v.end(), mystring), v.end());
The point is that vector is a sequence container and not geared towards manipulation by value. Depending on your design needs, a different standard library container may be more appropriate.
Note that the remove algorithm merely reorders elements of the range, it does not erase anything from the container. That's because iterators don't carry information about their container with them, and this is fully intentional: By separating iterators from their containers, one can write generic algorithms that work on any sensible container.
Idiomatic modern C++ would try to follow that pattern whenever applicable: Expose your data through iterators and use generic algorithms to manipulate it.

Have you considered std::remove_if?
http://www.cplusplus.com/reference/algorithm/remove_if/

IMO, professionally, it is perfectly logical to write up custom implementation do custom tasks, especially if the standard doesn't provide that. This is much better than writing (read: copy-paste) the same stuff again and again. One may also take advantage of inline function, template functions, macros to place same stuff in one place. This reduces any bugs that may encounter while re-using the same stuff (which may go somewhat wrong while pasting). It also makes it possible to correct the bug in one place.
Templates and macros, if properly designed, are very useful - they aren't code bloat.
Edit: Your code needs improvement:
bool remove(vector & msgs, cosnt string& toRemove);
To iterator over a collection, a for loop is sufficient. There is no need to check for size, take last iterator, check with begin, get data and all.
There is no need to waste a string - just compare it, and remove.
For your problem, I believe a map or set would fit much better.

How do you use stl's functions like for_each?

I started using stl containers because they came in very handy when I needed the functionality of a list, set and map and had nothing else available in my programming environment. I did not care much about the ideas behind it. STL documentation was interesting up to the point where it came to functions, etc. Then I skipped reading and just used the containers.
But yesterday, still being relaxed from my holidays, I just gave it a try and wanted to go a bit more the stl way. So I used the transform function (can I have a little bit of applause for me, thank you).
From an academic point of view it really looked interesting and it worked. But the thing that bothers me is that if you intensify the use of those functions, you need thousands of helper classes for mostly everything you want to do in your code. The whole logic of the program is sliced into tiny pieces. This slicing is not the result of good coding habits; it's just a technical need. Something, that makes my life probably harder not easier.
I learned the hard way, that you should always choose the simplest approach that solves the problem at hand. I can't see what, for example, the for_each function is doing for me that justifies the use of a helper class over several simple lines of code that sit inside a normal loop so that everybody can see what is going on.
I would like to know, what you are thinking about my concerns? Did you see it like I do when you started working this way and have changed your mind when you got used to it? Are there benefits that I overlooked? Or do you just ignore this stuff as I did (and will go on doing it, probably).
Thanks.
PS: I know that there is a real for_each loop in boost. But I ignore it here since it is just a convenient way for my usual loops with iterators I guess.

The whole logic of the program is sliced in tiny pieces. This slicing is not the result of good coding habits. It's just a technical need. Something, that makes my life probably harder not easier.
You're right, to a certain extent. That's why the upcoming revision to the C++ standard will add lambda expressions, allowing you to do something like this:
std::for_each(vec.begin(), vec.end(), [&](int& val){val++;})
but I also think it is often a good coding habit to split up your code as currently required. You're effectively separating the code describing the operation you want to do, from the act of applying it to a sequence of values. It is some extra boilerplate code, and sometimes it's just annoying, but I think it also often leads to good, clean, code.
Doing the above today would look like this:
int incr(int& val) { return val+1}
// and at the call-site
std::for_each(vec.begin(), vec.end(), incr);
Instead of bloating up the call site with a complete loop, we have a single line describing:
which operation is performed (if it is named appropriately)
which elements are affected
so it's shorter, and conveys the same information as the loop, but more concisely.
I think those are good things. The drawback is that we have to define the incr function elsewhere. And sometimes that's just not worth the effort, which is why lambdas are being added to the language.

I find it most useful when used along with boost::bind and boost::lambda so that I don't have to write my own functor. This is just a tiny example:
class A
{
public:
A() : m_n(0)
{
}
void set(int n)
{
m_n = n;
}
private:
int m_n;
};
int main(){
using namespace boost::lambda;
std::vector<A> a;
a.push_back(A());
a.push_back(A());
std::for_each(a.begin(), a.end(), bind(&A::set, _1, 5));
return 0;
}

You'll find disagreement among experts, but I'd say that for_each and transform are a bit of a distraction. The power of STL is in separating non-trivial algorithms from the data being operated on.
Boost's lambda library is definitely worth experimenting with to see how you get on with it. However, even if you find the syntax satisfactory, the awesome amount of machinery involved has disadvantages in terms of compile time and debug-ability.
My advice is use:
for (Range::const_iterator i = r.begin(), end = r.end(); i != end(); ++i)
{
*out++ = .. // for transform
}
instead of for_each and transform, but more importantly get familiar with the algorithms that are very useful: sort, unique, rotate to pick three at random.

Incrementing a counter for each element of a sequence is not a good example for for_each.
If you look at better examples, you may find it makes the code much clearer to understand and use.
This is some code I wrote today:
// assume some SinkFactory class is defined
// and mapItr is an iterator of a std::map<int,std::vector<SinkFactory*> >
std::for_each(mapItr->second.begin(), mapItr->second.end(),
checked_delete<SinkFactory>);
checked_delete is part of boost, but the implementation is trivial and looks like this:
template<typename T>
void checked_delete(T* pointer)
{
delete pointer;
}
The alternative would have been to write this:
for(vector<SinkFactory>::iterator pSinkFactory = mapItr->second.begin();
pSinkFactory != mapItr->second.end(); ++pSinkFactory)
delete (*pSinkFactory);
More than that, once you have that checked_delete written once (or if you already use boost), you can delete pointers in any sequence aywhere, with the same code, without caring what types you're iterating over (that is, you don't have to declare vector<SinkFactory>::iterator pSinkFactory).
There is also a small performance improvement from the fact that with for_each the container.end() will be only called once, and potentially great performance improvements depending on the for_each implementation (it could be implemented differently depending on the iterator tag received).
Also, if you combine boost::bind with stl sequence algorithms you can make all kinds of fun stuff (see here: http://www.boost.org/doc/libs/1_43_0/libs/bind/bind.html#with_algorithms).

I guess the C++ comity has the same concerns. The to be validated new C++0x standard introduces lambdas. This new feature will enable you to use the algorithm while writing simple helper functions directly in the algorithm parameter list.
std::transform(in.begin(), int.end(), out.begin(), [](int a) { return ++a; })

Local classes are a great feature to solve this. For example:
void IncreaseVector(std::vector<int>& v)
{
class Increment
{
public:
int operator()(int& i)
{
return ++i;
}
};
std::for_each(v.begin(), v.end(), Increment());
}
IMO, this is way too much complexity for just an increment, and it'll be clearer to write it in the form of a regular plain for loop. But when the operation you want to perform over a sequence becomes mor complex. Then I find it useful to clearly separate the operation to be performed over each element from the actual loop sentence. If your functor name is properly chosen, code gets a descriptive plus.

These are indeed real concerns, and these are being addressed in the next version of the C++ standard ("C++0x") which should be published either at the end of this year or in 2011. That version of C++ introduces a notion called C++ lambdas which allow for one to construct simple anonymous functions within another function, which makes it very easy to accomplish what you want without breaking your code into tiny little pieces. Lambdas are (experimentally?) supported in GCC as of GCC 4.5.

Those libraries like STL and Boost are complex also because they need to solve every need and work on any plateform.
As a user of these libraries -- you're not planning on remaking .NET are you? -- you can use their simplified goodies.
Here is possibly a simpler foreach from Boost I like to use:
BOOST_FOREACH(string& item in my_list)
{
...
}
Looks much neater and simpler than using .begin(), .end(), etc. and yet it works for pretty much any iteratable collection (not just arrays/vectors).

I need some C++ guru's opinions on extending std::string

I've always wanted a bit more functionality in STL's string. Since subclassing STL types is a no no, mostly I've seen the recommended method of extension of these classes is just to write functions (not member functions) that take the type as the first argument.
I've never been thrilled with this solution. For one, it's not necessarily obvious where all such methods are in the code, for another, I just don't like the syntax. I want to use . when I call methods!
A while ago I came up with the following:
class StringBox
{
public:
StringBox( std::string& storage ) :
_storage( storage )
{
}
// Methods I wish std::string had...
void Format();
void Split();
double ToDouble();
void Join(); // etc...
private:
StringBox();
std::string& _storage;
};
Note that StringBox requires a reference to a std::string for construction... This puts some interesting limits on it's use (and I hope, means it doesn't contribute to the string class proliferation problem)... In my own code, I'm almost always just declaring it on the stack in a method, just to modify a std::string.
A use example might look like this:
string OperateOnString( float num, string a, string b )
{
string nameS;
StringBox name( nameS );
name.Format( "%f-%s-%s", num, a.c_str(), b.c_str() );
return nameS;
}
My question is: What do the C++ guru's of the StackOverflow community think of this method of STL extension?

I've never been thrilled with this solution. For one, it's not necessarily obvious where all such methods are in the code, for another, I just don't like the syntax. I want to use . when I call methods!
And I want to use $!---& when I call methods! Deal with it. If you're going to write C++ code, stick to C++ conventions. And a very important C++ convention is to prefer non-member functions when possible.
There is a reason C++ gurus recommend this:
It improves encapsulation, extensibility and reuse. (std::sort can work with all iterator pairs because it isn't a member of any single iterator or container class. And no matter how you extend std::string, you can not break it, as long as you stick to non-member functions. And even if you don't have access to, or aren't allowed to modify, the source code for a class, you can still extend it by defining nonmember functions)
Personally, I can't see the point in your code. Isn't this a lot simpler, more readable and shorter?
string OperateOnString( float num, string a, string b )
{
string nameS;
Format(nameS, "%f-%s-%s", num, a.c_str(), b.c_str() );
return nameS;
}
// or even better, if `Format` is made to return the string it creates, instead of taking it as a parameter
string OperateOnString( float num, string a, string b )
{
return Format("%f-%s-%s", num, a.c_str(), b.c_str() );
}
When in Rome, do as the Romans, as the saying goes. Especially when the Romans have good reasons to do as they do. And especially when your own way of doing it doesn't actually have a single advantage. It is more error-prone, confusing to people reading your code, non-idiomatic and it is just more lines of code to do the same thing.
As for your problem that it's hard to find the non-member functions that extend string, place them in a namespace if that's a concern. That's what they're for. Create a namespace StringUtil or something, and put them there.

As most of us "gurus" seem to favour the use of free functions, probably contained in a namespace, I think it safe to say that your solution will not be popular. I'm afraid I can't see one single advantage it has, and the fact that the class contains a reference is an invitation to that becoming a dangling reference.

I'll add a little something that hasn't already been posted. The Boost String Algorithms library has taken the free template function approach, and the string algorithms they provide are spectacularly re-usable for anything that looks like a string: std::string, char*, std::vector, iterator pairs... you name it! And they put them all neatly in the boost::algorithm namespace (I often use using namespace algo = boost::algorithm to make string manipulation code more terse).
So consider using free template functions for your string extensions, and look at Boost String Algorithms on how to make them "universal".
For safe printf-style formatting, check out Boost.Format. It can output to strings and streams.
I too wanted everything to be a member function, but I'm now starting to see the light. UML and doxygen are always pressuring me to put functions inside of classes, because I was brainwashed by the idea that C++ API == class hierarchy.

If the scope of the string isn't the same as the StringBox you can get segfaults:
StringBox foo() {
string s("abc");
return StringBox(s);
}
At least prevent object copying by declaring the assignment operator and copy ctor private:
class StringBox {
//...
private:
void operator=(const StringBox&);
StringBox(const StringBox&);
};
EDIT: regarding API, in order to prevent surprises I would make the StringBox own its copy of the string. I can think fo 2 ways to do this:
Copy the string to a member (not a reference), get the result later - also as a copy
Access your string through a reference-counting smart pointer like std::tr1::shared_ptr or boost:shared_ptr, to prevent extra copying

The problem with loose functions is that they're loose functions.
I would bet money that most of you have created a function that was already provided by the STL because you simply didn't know the STL function existed, or that it could do what you were trying to accomplish.
It's a fairly punishing design, especially for new users. (The STL gets new additions too, further adding to the problem.)
Google: C++ to string
How many results mention: std::to_string
I'm just as likely to find some ancient C method, or some homemade version, as I am to find the STL version of any given function.
I much prefer member methods because you don't have to struggle to find them, and you don't need to worry about finding old deprecated versions, etc,. (ie, string.SomeMethod, is pretty much guaranteed to be the method you should be using, and it gives you something concrete to Google for.)
C# style extension methods would be a good solution.
They're loose functions.
They show up as member functions via intellisense.
This should allow everyone to do exactly what they want.
It seems like it could be accomplished in the IDE itself, rather than requiring any language changes.
Basically, if the interpreter hits some call to a member that doesn't exist, it can check headers for matching loose functions, and dynamically fix it up before passing it on to the compiler.
Something similar could be done when it's loading up the intellisense data.
I have no idea how this could be worked for existing functions, no massive change like this should be taken lightly, but, for new functions using a new syntax, it shouldn't be a problem.
namespace StringExt
{
std::string MyFunc(this std::string source);
}
That can be used by itself, or as a member of std::string, and the IDE can handle all the grunt work.
Of course, this still leaves the problem of methods being spread out over various headers, which could be solved in various ways.
Some sort of extension header: string_ext which could include common methods.
Hmm....
That's a tougher issue to solve without causing issues...

If you want to extend the methods available to act on string, I would extend it by creating a class that has static methods that take the standard string as a parameter.
That way, people are free to use your utilities, but don't need to change the signatures of their functions to take a new class.
This breaks the object-oriented model a little, but makes the code much more robust - i.e. if you change your string class, then it doesn't have as much impact on other code.
Follow the recommended guidelines, they are there for a reason :)

The best way is to use templated free functions. The next best is private inheritance struct extended_str : private string, which happens to get easier in C++0x by the way as you can using constructors. Private inheritance is too much trouble and too risky just to add some algorithms. What you are doing is too risky for anything.
You've just introduced a nontrivial data structure to accomplish a change in code punctuation. You have to manually create and destroy a Box for each string, and you still need to distinguish your methods from the native ones. You will quickly get tired of this convention.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js