Shared string in C++? - c++

Which "shared string" implementation for C++ would you recommend?
(Sorry if I missed a similar question. I had a look but could not find any)

I would use the STL: std::string and std::wstring.
Only if you need something fancier could you use smart pointers to wrap your own implementation. Smart pointers are available in the newer C++ standard library and in Boost: boost::shared_ptr, for example, if you use it inside a DLL; boost::intrusive_ptr works across DLL boundaries.
EDIT: As remarked in the comments, STL strings are not guaranteed to be immutable. If you want them to be, use the const specifier.
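A minimal sketch of the smart-pointer approach described in this answer, assuming C++11's std::shared_ptr rather than boost::shared_ptr. The alias SharedString and the helper make_shared_string are hypothetical names chosen for illustration; they do not come from any library.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical alias: a reference-counted handle to a string that cannot be
// modified through the handle (const gives the immutability mentioned above).
using SharedString = std::shared_ptr<const std::string>;

// Allocates the text once; every copy of the returned handle shares that
// single std::string instead of copying its characters.
SharedString make_shared_string(const char* text) {
    return std::make_shared<std::string>(text);
}
```

Copying a SharedString only bumps the reference count, so the character data is stored once and freed when the last handle goes away.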

std::(w)string can be shared, but this is not mandated by the standard. QString uses an atomic refcount for sharing.

I recommend starting with the standard strings, std::string and std::wstring.
However, there's one caveat:
Neither of the two string classes enforces a particular encoding. If you want your application to behave well when dealing with other locales or other languages than English, you should either start with std::wstring, or use something like UTF8-CPP which lets you deal with UTF-8 strings.
As Joel Spolsky pointed out, you have to know which encoding your strings are in to handle them correctly.
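To illustrate why the encoding matters, here is a small sketch that assumes the std::string holds valid UTF-8. The function utf8_code_points is a hypothetical helper written for this example (it is not part of UTF8-CPP); it shows that size() counts bytes, not characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts code points in a string ASSUMED to contain valid UTF-8.
// Continuation bytes look like 10xxxxxx, so every byte that does not match
// that pattern starts a new code point.
std::size_t utf8_code_points(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the UTF-8 bytes of "café" (written as "caf\xC3\xA9"), size() reports 5 while this helper reports 4.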

Related

Avoid pointers and #defines when programming in an Arduino?

I was looking over the StyleGuide for the Arduino when I noticed that in the Commenting your Code section, it recommends to avoid using pointers and #defines.
Is there a reason the writer stated this? There isn't an explanation as to why he/she stated this. It doesn't make sense to me. Is this something specific to embedded systems?
I don't know the specific reason the author wrote it, and I am not familiar with the library's written style - so I am going to answer in general terms of C++ programs.
I assume the preference is given because modern C++ typically favors other idioms, many of which were designed to avoid or minimize the issues frequently introduced by preprocessor and raw pointers.
Avoid Pointers
Instead of a pointer, it is conventional in C++ to use a reference for an object or a container such as a vector for a collection of objects.
//////// For an object
//// Using a pointer
bool getURL(t_url* const outUrl);
// In use:
bool result(obj.getURL(&outUrl));
//// versus using a reference
bool getURL(t_url& outUrl);
// In use:
bool result(obj.getURL(outUrl));
//////// For a collection
//// Using a pointer
bool apply(const double* const values, const size_t& count);
// In use:
bool result(obj.apply(array, count));
//// versus using a container
bool apply(const std::vector<double>& values);
// In use:
bool result(obj.apply(values));
Even pointers may be given object containers (auto pointer, smart pointer, shared pointer, weak pointer) because there can be a lot of complexity or ambiguity when dealing with raw pointers, particularly in clients' code. It's quite rare that I write C++ programs that take or return raw pointers.
Avoid Defines
Preprocessor/defines are also not generally the preferred approach in C++ - you have inline functions, anonymous namespaces, templates, and enums.
The ubiquitous example of a macro that is problematic for many reasons is #define max(a,b) ((a > b) ? a : b), versus std::max.
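One of the problems with that macro, double evaluation of an argument, can be sketched as follows. The macro is reproduced under the hypothetical name UNSAFE_MAX so it cannot clash with std::max; the two helper functions are also made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>

// The macro from the text, renamed (hypothetical name) for the example.
#define UNSAFE_MAX(a, b) ((a > b) ? a : b)

// Returns the value of i after taking a maximum through the macro.
int counter_after_macro() {
    int i = 5;
    int j = 3;
    // Expands to ((i++ > j) ? i++ : j), so i++ is evaluated twice.
    int m = UNSAFE_MAX(i++, j);
    (void)m;
    return i;
}

// Returns the value of i after taking a maximum through std::max.
int counter_after_function() {
    int i = 5;
    int j = 3;
    // Function arguments are evaluated exactly once.
    int m = std::max(i++, j);
    (void)m;
    return i;
}
```

With the macro, i ends up at 7; with std::max it ends up at 6, which is what the call site suggests.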
Conclusion
If I see a C++ program which uses a considerable amount of either, I find myself wondering in what decade it was written, or if the author was writing in the "C with some more features" dialect.
Another answerer said the "advice is garbage". I disagree. The advice in Arduino simply says 'avoid pointers' and 'avoid #defines'. Of course, there will be times when you need to use these facilities, but you can write a clearer program when you use the language and library facilities which were intended to replace them (in the common ways they were misused or problematic). To avoid using them means to use them sparingly and only when necessary, favoring more modern and idiomatic alternatives.
That advice is garbage -- pointers are an extremely useful and powerful tool, and quite often they're required in order to do something effectively.
#defines, on the other hand, should often be avoided for a number of reasons in favor of inline functions, but again there are many situations where macros should be used and are the best solution. It depends on your problem -- be smart, and know when to use them and when not to. Don't blindly avoid using them because some FAQ told you not to.

C++ - Is string a built-in data type?

In C++, is string a built-in data type?
Thanks.
What is the definition of built-in that you want to use? Is it built into the compiler toolset that you have? Yes, it should be. Is it treated specially by the compiler? No, the compiler treats that type like any user-defined type. Note that the same can probably be said of many other languages for which most people will answer yes.
One of the focuses of the C++ committee is keeping the core language to a bare minimum, and provide as much functionality as possible in libraries. That has two intentions: the core language is more stable, libraries can be reimplemented, enhanced... without changing the compiler core. But more importantly, the fact that you do not need special compiler support to handle most of the standard library guarantees that the core language is expressive enough for most uses.
Put more simply, in negated form: if the language required special compiler support to implement std::string, that would mean that users don't have enough power to express that or a similar concept in the core language.
It's not a primitive -- that is, it's not "built in" the way that int, char, etc are. The closest built-in string-like type is char * or char[], which is the old C way of doing stringy stuff, but even that requires a bunch of library code in order to use productively.
Rather, std::string is a part of the standard library that comes with nearly every modern C++ compiler in existence. You'll need to #include <string> (or include something else that includes it, but really you should include what your code refers to) in order to use it.
If you are talking about std::string then no.
If you are talking about character array, I guess you can treat it as an array of a built in type.
No.
Built-in or "primitive" types can be used to create string-like functionality with the built-in type char. This, along with utility functions, was what was used in C. In C++, this functionality still exists, but a more intuitive way of using strings was added.
The string class is a part of the std namespace and is an instantiation of the basic_string template class. It is defined as
typedef basic_string<char> string;
It is a class with the ability to dynamically resize as needed and has many member functions acting as utilities. It also uses operator overloading so it is more intuitive to use. However, this functionality also means it has an overhead in terms of speed.
Depends on what you mean by built-in, but probably not. std::string is defined by the standard library (and thus the C++ standard) and is very universally supported by different compilers, but it is not a part of the core language like int or char.
It can be built-in, but it doesn't have to be.
The C++ standard library has a documented interface for its components. This can be realized either as library code or as compiler built-ins. The standard doesn't say how it should be implemented.
When you use #include <string> you have the std::string implementation available. This could be either because the compiler implements it directly, or because it links to some library code. We don't know for sure, unless we check each compiler.
None of the known compilers have chosen to make it a built-in type, because they didn't have to. The performance of a pure library implementation was obviously good enough.
No. It's part of standard library.
No, string is a class.
Definitely not. String is a class from standard library.
char *, or char[] are built-in types, but char, int, float, double, void, bool without any additions (as pointers, arrays, sign or size modifiers - unsigned, long etc.) are fundamental types.
No. There are different implementations (e.g. Microsoft Visual C++), but char* is the C++ way of representing strings.

C++ strings, when to use what?

It's been quite some time now that I've been coding in C++, and I think most who actually code in C++ would agree that one of the trickiest decisions is choosing from an almost dizzying number of available string types. I mostly prefer ATL CString for its ease of use and features, but would like a comparative study of the available options.
I've checked out SO and haven't found any content which assists one in choosing the right string. There are websites which describe conversions from one string to another, but that's not what we want here.
Would love to have a comparison based on specialty, performance, portability (Windows, Mac, Linux/Unix, etc), ease of use/features, multi language support(Unicode/MBCS), cons (if any), and any other special cases.
I'm listing out the strings that I've encountered so far. I believe, there would be more, so we may edit this later to accommodate other options. Mind you, I've worked mostly on Windows, so the list reflects the same:
1. char*
2. std::string
3. STL's basic_string
4. ATL's CString
5. MFC's CString
6. BSTR
7. _bstr_t
8. CComBstr
Don't mean to put a dampener on your enthusiasm for this, but realistically it's inefficient to mix a lot of string types in the one project, so the larger the project gets the more inevitably it should settle on std::string (which is a typedef to an instantiation of STL's basic_string for type char, not a different entity), given that's the only Standard value-semantic option. char* is ok mainly for fixed sized strings (e.g. string literals, fixed size buffers) or interfacing with C.
Why do I say it's inefficient? You end up with needless template instantiations for the variety of string arguments (permutations even for multiple arguments). You find yourself calling functions that want to load a result into a string&, then have to call .c_str() on that and construct some other type, doing redundant memory allocation. Even const std::string& requires a string temporary if called using an ASCIIZ char* (e.g. to some other string type's buffer). When you want to write a function to handle the type of string a particular caller wants to use, you're pushed towards templates and therefore inline code, longer compile times and recompilation dependencies (there are some ways to mitigate this, but they get complex, and to be convenient or automated they tend to require changes to the various string types - e.g. casting operator or member function returning some common interface/proxy object).
Projects may need to use non-Standard string types to interact with libraries they want to use, but you want to minimise that and limit the pervasiveness if possible.
The sorry story of C++ string handling is too depressing for me to write an essay on, but just a couple of points:
ATL and MFC CString are the same thing (same code and everything). They were merged years ago.
If you're using either _bstr_t or CComBstr, you probably wouldn't use BSTR except on calls into other people's APIs which take BSTR.
char* - fast, features include those that are in the <cstring> header, error-prone (too low-level)
std::string - this is actually a typedef for std::basic_string<char, char_traits<char> >. A beautiful thing: first of all, it's fast too. Second, you can use all the <algorithm>s because basic_string provides iterators. For wide-character support there is another typedef, wstring, which is std::basic_string<wchar_t, char_traits<wchar_t> >. This (basic_string) is a standard type and is therefore absolutely portable. I'd go with this one.
ATL's and MFC's CStrings do not even provide iterators, therefore they are an abomination for me, because they are a class-wrapper around c-strings and they are very badly designed. IMHO
don't know about the rest.
Hope this partial information helps.
Obviously, only the first three are portable, so they should be preferred in most cases. If you're doing C++, then you should avoid char * in most instances, as raw pointers and arrays are error-prone. Interfacing with low-level C, such as in system calls or drivers, is the exception. std::string should be preferred by default, IMHO, because it meshes so nicely with the rest of the STL.
Again, IMHO, if you need to work with e.g. MFC, you should work with everything as std::string in your business logic, and translate to and from CString when you hit the WinApi functions.
2 and 3 are the same. 4 and 5 are the same, too. 7 and 8 are wrappers of 6. So, arguably, the list contains just C's strings, standard C++'s strings, Microsoft's C++ strings, and Microsoft's COM strings. That gives you the answer: in standard C++, use standard C++ strings (std::string)
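The claim that std::string and basic_string are one and the same type can be checked at compile time. The sketch below assumes C++11 (for static_assert and <type_traits>); as_c_string is a hypothetical helper showing the usual way to hand a standard string to a C-style interface.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// std::string is a typedef for an instantiation of basic_string, not a
// separate class; the same holds for std::wstring.
static_assert(std::is_same<std::string, std::basic_string<char>>::value,
              "std::string is basic_string<char>");
static_assert(std::is_same<std::wstring, std::basic_string<wchar_t>>::value,
              "std::wstring is basic_string<wchar_t>");

// Hypothetical helper: crossing into a char*-based API needs no extra string
// type. The pointer stays valid only while s is alive and unmodified.
const char* as_c_string(const std::string& s) {
    return s.c_str();
}
```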

What are the bad habits of C programmers starting to write C++? [closed]

Closed 12 years ago.
A discussion recently ended laughing at the bad habits of programmers that have been too exposed to a language when they start programming in another language. The best example would be a Pascal programmer starting to #define begin { and #define end } when starting to write C.
Goal is to try to catch the bad habits of C programmers when they start using C++.
Tell about the big don't that you encountered. One suggestion by answer, please, to try to achieve a kind of best of.
For the ones interested in good habits, have a look at the accepted answer to this question.
Using raw pointers and resources instead of RAII objects.
using char* instead of std::string
using arrays instead of std::vector (or other containers)
not using other STL algorithms or libraries like boost where appropriate
abusing the preprocessor where constants, typedefs or templates would have been better
writing SESE-style (single-entry single exit) code
Declaring all the variables at the top of a function instead of as close as possible to where they are used.
Not using the STL, especially std::string,
and/or
using std::strings and reverting to old c string functions in tight corners.
Writing class definitions that are 2000 lines of code.
Copying and pasting that class definition into 12 different places.
Using switch statements when a simple virtual method would do.
Failing to allocate memory in constructor and deallocate in destructor.
Virtual methods that take optional arguments.
Writing while loops to manipulate char* strings.
Writing giant macros that are a page in length. (Could have used templates instead).
Adding using declarations to header files so they can avoid names like std::string in type declarations.
using pointers instead of references
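Several of the habits listed so far (raw pointers instead of RAII objects, arrays instead of std::vector, pointers instead of references) have short C++ counterparts. The following is a sketch under the assumption of C++11; Message, make_message, text_length and first_squares are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical type used only for this illustration.
struct Message {
    std::string text;  // std::string instead of char*
};

// RAII: the unique_ptr owns the Message and deletes it automatically,
// so no caller has to remember a matching delete.
std::unique_ptr<Message> make_message(const std::string& text) {
    std::unique_ptr<Message> m(new Message);
    m->text = text;
    return m;
}

// A reference parameter cannot be null and implies no ownership transfer.
std::size_t text_length(const Message& m) {
    return m.text.size();
}

// std::vector instead of a raw array: the size travels with the data and
// the memory is released when the vector goes out of scope.
std::vector<int> first_squares(std::size_t n) {
    std::vector<int> out;
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<int>(i * i));
    }
    return out;
}
```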
Very experienced developers not understanding casting or even object oriented programming:
I started helping out on a project and one of the senior guys was having a problem with some code that used to work and now didn't.
(Class names have been changed to protect the innocent, and I can't remember the exact names)
He had some C++ code that was listening to incoming message classes and reading them. The way it had worked in the past was that a Message class was passed in and he would interrogate a variable on it to find out what type of message it was. He would then C-style cast the Message as another specialised class he'd written that inherited from Message. This new class had functions on it that extracted the data how he wanted it. Now, this had been working for him fine but now was not.
After many hours looking through his code he could not see a problem and I had a look over his shoulder. Immediately I told him that it's not a good idea to C-style cast Message to a derived class which it was not. He disagreed with me and said he'd been doing it for years and if that was wrong then everything he does is wrong because he frequently uses this approach. He was further backed up by a contractor who told me I was wrong. They both argued that this always works and the code hasn't changed so it's not this approach but something else that has broken his code.
I looked a bit further and found the difference. The latest version of the Message class had a virtual function and hadn't previously had any use of virtual. I told the pair of them that there was now a virtual table and functions were being looked up, etc, etc.... and this was causing their problem, etc, etc.... They eventually agreed and I was presented with a comment that I will never forget: "Virtual completely screws up polymorphism and object oriented programming".
I forwarded them a copy of a decorator pattern as an example of how to add a function to an existing class but heard nothing back from them. How they fixed the problem, I have no idea.
One word: macros. I am not saying macros have no place at all in C++, but former C programmers tend to use them way too much after they switch to C++.
Using C-style casts.
C++ allows you to independently choose whether to allow casts between unrelated types, and whether to allow changes to const and volatile qualifiers, giving considerable improvements to compile-time type safety compared with C. It also offers completely safe casts at the cost of a runtime check.
C-style casts, unchecked conversions between just about any types, allow whole classes of error that could be easily identified by more restrictive casts. Their syntax also makes them very difficult to search for, if you want to audit buggy code for dubious conversions.
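The "completely safe cast at the cost of a runtime check" is dynamic_cast. Below is a minimal sketch; the class names Message, TextMessage and PingMessage and the helper as_text are hypothetical, chosen to echo the message-handling story in an earlier answer.

```cpp
#include <cassert>

// Hypothetical hierarchy. A virtual destructor makes the base polymorphic,
// which dynamic_cast requires.
struct Message {
    virtual ~Message() {}
};
struct TextMessage : Message {};
struct PingMessage : Message {};

// Returns a TextMessage pointer if m really is one, otherwise nullptr.
// A C-style cast would hand back a pointer in both cases, and using it on a
// PingMessage would be undefined behaviour.
const TextMessage* as_text(const Message& m) {
    return dynamic_cast<const TextMessage*>(&m);
}
```

Unlike a C-style cast, the named casts are also easy to search for when auditing code.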
Assuming said programmers have already made the mistake of attempting to learn C++:
Mistakes
Not using STL.
Trying to wrap everything in classes.
Trying to use templates for everything.
Not using Boost. (I know Boost can be a real PITA, and a learning curve, but C++ is just C+ without it. Boost gives C++ some batteries).
Not using smart pointers.
Not using RAII.
Overusing exceptions.
Controversial
Moving to C++. Don't do it.
Try to convert C stdio to iostreams. Iostreams SUX. Don't use it. It's inherently broken. Look here.
Using the following parts of the libstdc++ library:
strings (beyond freeing them for me, go the hell away)
localization (what the hell does this have to do with c++, worse yet, it's awful)
input/output (64 bit file offsets? heard of them?)
Naively believing you can still debug on the command line. Don't use C++ extensively without a code crane (IDE).
Following C++ blogs. C++ blogs carp on about what essentially boils down to metadata and sugar. Beyond a good FAQ, and experience, I've yet to see a useful C++ blog. (Note that's a challenge: I'd love to read a good C++ blog.)
Writing using namespace std because everyone does and then never reflecting on its meaning.
Or knowing what it means but saying "std::cout << "Hello World" << std::endl; looks ugly".
Passing objects with pointers instead of references. Yes, there are still times when you need pointers in C++, but references are safer, so you should use them when you can.
Making everything in a class public. So, data members that should be private aren't.
Not fully understanding the semantics of pointers and references and when to use one or the other. Related to pointers is also the issue of not managing dynamic allocated memory correctly or failing at using "smarter" constructs for that(e.g. smart pointers).
My favourite is the C programmer who writes a single method with multiple, optional, arguments.
Basically, the function would do different things depending on the values and/or nullability of the arguments.
Not using templates when creating algorithms and data structures (example). It makes things either too localized or too generic
I.e. writing
void qsort(MyStruct *begin, size_t length); //too localized
void qsort(void *begin, size_t length,
size_t rec_size, int (*compare)(void*, void*)); //too generic
instead of
template <class RA_Iter>
void qsort(RA_Iter begin, size_t length);
//uses RA_Iter::value_type::operator< for comparison
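A runnable sketch of the templated style, assuming C++11. To keep it short it swaps quicksort for an insertion sort built from standard algorithms and takes an iterator pair instead of a start and length; insertion_sort is a hypothetical name, not the answerer's function.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Sorts [first, last) using operator< of the element type.
// For each element, find where it belongs in the already-sorted prefix
// and rotate it into place.
template <class Iter>
void insertion_sort(Iter first, Iter last) {
    for (Iter it = first; it != last; ++it) {
        std::rotate(std::upper_bound(first, it, *it), it, std::next(it));
    }
}
```

The same template works for a std::vector<int> and for a plain array of double, with no casts and no hand-written comparison callback.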
Well, bad program design transcends languages (casts, ignoring warnings, unnecessary precompiler magic, unnecessary bit-twiddling, not using the char classification macros), and the C language itself doesn't create too many "bad habits" (OK, macros, especially from the stone ages), and many of the idioms translate directly. But here are a few that could be considered:
Using a feature just because it's in C++ and therefore must be the right way to do something. Some programs just don't need inheritance, MI, exceptions, RTTI, templates (great as they are, the debugging load is steep), or virtual class stuff.
Sticking with some code snippet from C without thinking about whether C++ has a better way. (There's a reason you now have class, private, public, const (expanded beyond C89), static class functions, and references.)
Not being familiar with the C++ I/O library (it's big, and you do need to know it), and mixing C++ I/O and C I/O.
He thinks that C++ is just a slightly different language from C. He will continue programming C masked as C++. No advanced use of classes; structs are considered less powerful than classes; namespaces, new headers, templates: none of these new elements are used. He will continue declaring integer variables without int, and he will not provide function prototypes. He will use malloc and free, unsafe pointers, and the preprocessor to define inline functions. This is just a small list ;)
Confused uses of structs vs. classes, overuse of global methods that take object pointers as arguments, and globally-accessible instance pointers, a la:
extern Application* g_pApp;
void RunApplication(Application* app, int flags);
Also (not saying it's totally useless, but still):
const void* buf;
Declaring all the variables at the start of the function itself even if the variable will be used only after 100 lines or so.
Happens especially for local variables declared inside a function.
Solving the problem instead of creating a class-based monstrosity guaranteed to keep you in health insurance and 401K benefits.
Implementing lisp in a single file and doing the design in that.
Writing normal readable functions instead of overriding operators?
Writing in a style which can be understood by the junior programmers which see good practice as "not writing in C++".
Talking to the OS in its own language.
Not leaving well enough alone, and using C instead.

C++: Should I use strings or char arrays, in general?

I'm a bit fuzzy on the basic ways in which programmers code differently in C and C++. One thing in particular is the usage of strings in C++ over char arrays, or vice versa. So, should I use strings or char arrays, in general, and why?
In C++ you should in almost all cases use std::string instead of a raw char array.
std::string manages the underlying memory for you, which is by itself a good enough reason to prefer it.
It also provides a much easier to use and more readable interface for common string operations, e.g. equality testing, concatenation, substring operations, searching, and iteration.
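A short sketch of the operations mentioned in this answer (concatenation, searching, substrings, equality). The functions greet and local_part are hypothetical examples, not library functions.

```cpp
#include <cassert>
#include <string>

// Concatenation with operator+; the result manages its own memory.
std::string greet(const std::string& name) {
    return "Hello, " + name + "!";
}

// Returns the part of address before the first '@', or the whole string
// if there is no '@'.
std::string local_part(const std::string& address) {
    std::string::size_type at = address.find('@');
    // When find fails, at is npos and substr copies to the end of the string.
    return address.substr(0, at);
}
```

Equality testing is just operator==, for example greet("Ada") == "Hello, Ada!".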
If you're modifying or returning the string, use std::string. If not, accept your parameter as a const char* unless you absolutely need the std::string member functions. This makes your function usable not only with std::string::c_str() but also string literals. Why make your caller pay the price of constructing a std::string with heap storage just to pass in a literal?
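The rule of thumb in this answer could look like the sketch below. has_prefix and with_suffix are hypothetical functions: the first only reads its arguments, so it takes const char*; the second builds a new string, so it returns std::string.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Read-only parameters as const char*: callers can pass a string literal or
// std::string::c_str() without a temporary std::string being constructed.
bool has_prefix(const char* text, const char* prefix) {
    return std::strncmp(text, prefix, std::strlen(prefix)) == 0;
}

// The result is a new string, so std::string is the natural return type.
std::string with_suffix(const char* text, const char* suffix) {
    std::string result(text);
    result += suffix;
    return result;
}
```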
Others have said it already: use std::string wherever possible. However, there are areas where you need char *, e.g. if you want to call some system services.
As is the case with everything what you choose depends on what you're doing with it. std::string has real value if you're dealing with string data that changes. You can't beat char[] for efficiency when dealing with unchanging strings.
Use std::string.
You will have fewer problems (I think almost none, at least none come to my mind) with buffer sizes.
C has char[] while c++ has std::string too...
I commonly hear that one should "Embrace the language" and, following that rule, you should use std::string...
However, it pretty much depends on which library you are using and how that library wants you to express your strings.
std::string is a container class, and inside it, is a char[]
If you use std::string, you have many advantages, such as functions that will help you [compare, substr, as examples]