Char array pointer vs string reference in params - c++

I often see the following structure, especially in constructors:
class::class(const string &filename)
{
}
class::class(const char * const filename)
{
}
By stepping through in the debugger, I found that the second constructor is always called when I pass a hard-coded string.
Any idea:
1) Why the dual structure is used?
2) What is the speed difference?
Thanks.

Two constructors are needed because you can pass a NULL to your MyClass::MyClass(const std::string &arg). Providing second constructor saves you from a silly crash.
For example, you write a constructor for your class and make it take a const std::string & so that you don't have to check pointers for validity, as you would with a const char*.
And everywhere in your code you're just using std::strings. At some point you (or another programmer) pass a const char* there. Here comes the nice part of std::string: it has a constructor that takes a const char*, which is very good, apart from the fact that std::string a_string(NULL) compiles without any problems but just doesn't work.
That's where a second constructor like the one you've shown comes in handy:
MyClass::MyClass(const char* arg)
: m_string(arg ? arg : "")
{}
and it will make a valid std::string object if you pass it a NULL.
In this case I don't think you'd need to worry about any speed. You could try measuring, although I'm afraid you'd be surprised with how little difference (if any) there would be.
EDIT: Just tried std::string a_string(NULL);, compiles just fine, and here's what happens when it is run on my machine (OS X + gcc 4.2.1) (I do recall I tried it on Windows some time ago, result was very similar if not exactly same):
std::logic_error: basic_string::_S_construct NULL not valid
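A minimal sketch of the two-constructor pattern discussed above, assuming a hypothetical class FileName (the class and member names are invented for illustration and are not from the question). It is meant to show that a string literal selects the const char* overload and that a null pointer is turned into an empty string rather than reaching std::string's constructor.

```cpp
#include <cassert>
#include <string>

// Hypothetical class for illustration only.
class FileName {
public:
    // Selected when the caller passes a std::string.
    explicit FileName(const std::string& name)
        : m_name(name), m_from_cstr(false) {}

    // Selected for string literals and other const char* arguments.
    // A null pointer is mapped to "" before std::string ever sees it.
    explicit FileName(const char* name)
        : m_name(name ? name : ""), m_from_cstr(true) {}

    const std::string& name() const { return m_name; }
    bool from_cstr() const { return m_from_cstr; }

private:
    std::string m_name;
    bool m_from_cstr;  // records which overload ran, for the demonstration
};

inline void file_name_usage() {
    FileName literal("data.txt");                 // const char* overload
    FileName from_string(std::string("data.txt")); // std::string overload
    const char* missing = nullptr;
    FileName from_null(missing);                  // no crash, empty name

    assert(literal.from_cstr());
    assert(!from_string.from_cstr());
    assert(from_null.name().empty());
}
```

The literal picks the const char* overload because an array-to-pointer conversion ranks better than the user-defined conversion to std::string.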

This is useful if the implementation deals with const char*s by itself, but is mostly called by std::string users. These can call using the std::string API, which usually just calls c_str() and dispatches to the const char* implementation. On the other side, if the caller does already have a c-string, no temporary or unneeded std::string needs to be constructed (which can be costly, for longer strings it's a heap allocation).
Also, I once used it to resolve the following case:
My interface took std::string's, but had to be implemented in an external module, thus the STL binary versions of both the module AND the caller module had to match exactly, or it would have crashed (not really good for a portable library… ). So I changed the actual interface to use const char*, and added std::string overloads which I declared inline, so they weren't exported. It didn't break existing code, but resolved all my module boundary problems.
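The arrangement described in this answer could look roughly like the following sketch. The function name count_dots is hypothetical; the point is only that the const char* version holds the logic and the std::string version is an inline forwarder that never needs to be exported across a module boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// The real implementation works on a C string, so callers that already
// hold a const char* never construct a temporary std::string.
// (Hypothetical example function: counts '.' characters.)
inline std::size_t count_dots(const char* path) {
    std::size_t n = 0;
    for (; *path != '\0'; ++path) {
        if (*path == '.') ++n;
    }
    return n;
}

// Convenience overload for std::string users. It only forwards via c_str()
// and is inline, so it can live in a header.
inline std::size_t count_dots(const std::string& path) {
    return count_dots(path.c_str());
}

inline void count_dots_usage() {
    assert(count_dots("a.b.c") == 2);              // no std::string created
    assert(count_dots(std::string("x.y")) == 1);   // forwarded through c_str()
}
```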

1) Why the dual structure is used?
The string reference version is required if std::string objects are to be used conveniently as parameters, as there is no implicit conversion from a std::string to a const char * const. The const char * const version is optional, as character arrays can implicitly be converted into std::strings, but it is more efficient, as no temporary std::string need be created.
2) What is the speed difference?
You will need to measure that yourself.

They are offered basically for convenience. Sometimes, if you call C functions, you get char* pointers. Other times, you get strings, so offering both constructors is just a convenience for the caller. As for the speed, both have virtually the same speed, as they both send a memory address to the constructor.

Related

Avoiding the func(char *) api on embedded

Note:
I heavily changed my question to be more specific, but I will keep the old question at end of the post, in case it is useful to anyone.
New Question
I am developing an embedded application which uses the following types to represent strings :
string literals(null terminated by default)
std::array<char,size> (not null terminated)
std::string_view
I would like to have a function that accepts all of them in a uniform way. The only problem is determining the length: for a string literal I have to count it with strlen, which doesn't work in the other two cases, while size() works for those two but not for case 1.
Should I use a variant like so: std::variant<const char *,std::span<char>> ? Would that be heavy by forcing myself to use std::visit ? Would that thing even match correctly all the different representations of strings?
Old Question
Disclaimer when I refer to "string" in the following context I don't mean an std::string but just an abstract way to say alphanumeric series.
Most of the time, when I have to deal with strings in C++, I use something like void func(const std::string &); or, in some cases, the same without the const and the reference. Now, on an embedded app, I don't have access to std::string, so I tried to use std::string_view. The problem is that a std::string_view constructed from a non-literal is sometimes not null-terminated.
Edit: I changed the question a bit, as the comments gave some very helpful hints.
So even though y has a size in the example below:
std::array<char,5> x{"aa"} ;
std::string_view y(x.data());
I can't use y with a C API like printf("%s", y.data()), which relies on null termination.
#include <array>
#include <string_view>
#include "stdio.h"
int main(){
std::array<char,5> x{"aaa"};
std::string_view y(x.data());
printf("%s", y.data());
}
To summarize:
What can I do to implement a stack allocated string that implicitly gets a static size at its constructors (from null terminated strings,string literals, string_views and std::arrays) and it is movable (or cheap copyable)?
What would be the underlying type of my class? What would be the speed costs in comparison with the underlying type?
I think that you are looking at two largely and three subtly different semantics of char*.
Yes, all of them point at char but the type-specific info on how to determine the length is not carried by that. Even in the ancient ancestor of C++ (not saying C...) a pointer to char was not always the same. Already there pointers to terminated and non-terminated sequences of characters could not be mixed.
In C++ the tool of overloading a function exists and it seems to be the obvious solution for your problem. You can still implement that efficiently with only one (helper) function doing the actual work, based on an explicit size information in a second parameter.
Overload the function which is "visible" on the API, with three versions for the three types. Have it determine the length in the appropriate way, then call the single helper function, providing that length.
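The overload-plus-helper approach described in this answer might be sketched as follows, assuming C++17 for std::string_view. The names count_a and count_a_impl are hypothetical, and counting the letter 'a' is only a stand-in for whatever the real function would do.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Single helper doing the actual work on an explicit (pointer, size) pair.
inline std::size_t count_a_impl(const char* data, std::size_t size) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < size; ++i) {
        if (data[i] == 'a') ++n;
    }
    return n;
}

// Null-terminated strings, including string literals: length via strlen.
inline std::size_t count_a(const char* s) {
    return count_a_impl(s, std::strlen(s));
}

// Fixed-size buffer that need not be null-terminated: the length is N.
template <std::size_t N>
std::size_t count_a(const std::array<char, N>& a) {
    return count_a_impl(a.data(), N);
}

// string_view carries its own length.
inline std::size_t count_a(std::string_view sv) {
    return count_a_impl(sv.data(), sv.size());
}

inline void count_a_usage() {
    std::array<char, 3> buf{{'a', 'b', 'a'}};  // no terminator stored
    assert(count_a("banana") == 3);
    assert(count_a(buf) == 2);
    assert(count_a(std::string_view("aaxa", 3)) == 2);  // views only "aax"
}
```

Each visible overload only decides how the length is obtained, so the processing code exists once and no std::variant or std::visit is involved.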

When and why should you use String&, String and Const?

I was working on a piece of code.
struct Argument
{
Argument(): s_name(""), name(""), optional(true) {}
Argument(String& s_name_inp, String& name_inp, bool optional_inp):s_name(s_name_inp),name(name_inp),optional(optional_inp){}
.....More code.....
}
Somewhere later in the code:
void addArgument(String& name_inp,bool optional=true)
{
String name;
//Creating a temp string to store the corrected name if the user doesn't enter - or -- wrt. name of their argument.
name = isDashed(name_inp) ? name_inp : addDash(name_inp) ;
//using the dashed name to check if it's shortname or long name.
if(name.size()>3)
{
//This is the long name.
Argument arg("", name, optional);
insertArgument(arg);
}
else
{
//This is for the short name
Argument arg(name, "", optional);
insertArgument(arg);
}
}
Both Struct Argument and fn addArgument are part of a class where struct Argument is defined in the private and addArgument in the public.
It throws an error when I compile the code.
For Long name one-
error: no instance of constructor "ArgumentParser::Argument::Argument" matches the argument list
argument types are: (const char [1], ArgumentParser::String, __nv_bool)
For Short Name one -
error: no instance of constructor "ArgumentParser::Argument::Argument" matches the argument list
argument types are: (ArgumentParser::String, const char [1], __nv_bool)
I could figure out how to fix it. This error is coming because of the empty strings "" which I enter. Adding const in the struct Argument constructor fixes the problem.
struct Argument
{
Argument(): s_name(""), name(""), optional(true) {}
Argument(const String& s_name_inp, const String& name_inp, bool optional_inp):s_name(s_name_inp),name(name_inp),optional(optional_inp){}
.....More code.....
}
Similarly, declaring a String blank = "" ; and passing it while initializing an obj of struct Argument instead of "" curbs the problem as well. Also, passing String instead of String& in the Argument constructor also solves the issue-
Argument(String s_name_inp, String name_inp, bool optional_inp):s_name(s_name_inp),name(name_inp),optional(optional_inp){}
Thus what I concluded from this is that simply passing "" is a problem since it is not stored in some variable. It doesn't have a location that can be referenced (String&) in case a change is made inside the code. That's why adding a const before String& in the constructor assures the compiler that no change will be made even though we are passing by reference, so the compiler allows the use of "".
However I do not understand why the compiler is being so 'smart' even though I haven't done any incorrect operation. It's like the compiler is popping bugs for 'security' also, apart from the usual 'errors' we make.
With this question, I also want to understand in a broader sense, the use of const.
I get that it's just for ensuring that no change will be made to the variable being passed(or returned) to a function/constructor. But why do we need it if I, as the programmer, can ensure that I won't be changing the variable. Why do we need const then?
One thing I know is, it can be used to tell others to keep a parameter unchanged if they see your code.(https://www.youtube.com/watch?v=4fJBrditnJU is a good source for learning about const) Also, what is the difference between use of const in these two cases -
int somefunc(int& var) const
{
// var=4; This isn't allowed due to the const in fn
return var;
}
int somefunc(const int& var)
{
// var =4 ; This is also not allowed.
return var;
}
Also, I want to know how passing by reference and passing by value differ if they are just being used for assignments, i.e, no change is being made to them but they are being used for checking conditions, or doing some assignments to other vars.
Adding const String& is like a surety that whatever the user is passing is not being tampered with, in the code, so can't we simply replace it with String? Because passing by value will instantiate a copy of the passed variable/parameter? Why do we use const String& then?
Another question about passing pointers and passing by reference: The only use I know of String&(or int& or any other) is to directly pass the actual 'thing' into the function, not a copy of it, so whatever we do with that 'thing' will be reflected on the original, just like we use pointers to get the changes to be made to the actual 'thing' and not a copy of it. Why don't we use pointers instead of passing by reference? What advantage does it bring?
I know this question is kind of vast, but these are all interconnected questions. The answer to one complements another. Thanks to anyone who takes the time to answer whatever they see fit!
Pointers and l-value references are exchangeable, for the most part. It's just less to write at the site of invocation.
The difference between const & and & comes, as you noticed, from a simple reference requiring a variable to reference. A reference with & has the semantic of "let me write that down for you". A const & allows the creation of the temporary copy on-the-fly, and has the semantic of "let me have a look". Pass by value has the meaning of "give me a copy, I decide what to do with it".
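The three parameter styles from the paragraph above can be compared in a small sketch. The function names are hypothetical and std::string stands in for the question's String type, which is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// "Let me write that down for you": needs a named, modifiable std::string.
inline void add_dash(std::string& name) { name.insert(0, "-"); }

// "Let me have a look": also accepts temporaries, e.g. one built from "".
inline std::size_t length_of(const std::string& name) { return name.size(); }

// "Give me a copy": the function owns its parameter and may change it.
inline std::string dashed_copy(std::string name) {
    name.insert(0, "-");
    return name;
}

inline void parameter_styles_usage() {
    std::string n = "v";
    add_dash(n);                       // modifies the caller's variable
    assert(n == "-v");
    assert(length_of("") == 0);        // temporary bound to the const&
    assert(dashed_copy("h") == "-h");  // temporary initialises the copy
    // add_dash("");                   // would not compile: a temporary
    //                                 // cannot bind to std::string&
}
```

The commented-out call is the same situation as the constructor error in the question: "" would have to become a temporary, and a temporary cannot bind to a non-const lvalue reference.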
In practical experience, avoiding copies is your primary concern. So call-by-value is something you want to avoid for anything bigger than an integral value.
Const-correctness is mostly just design of the C++ language, it's not strictly required from a technical point of view. You can consider it to be a way of saying "trust me, I'm not going to break it".
About passing temporary values to a simple & parameter, think about it for a moment who actually owns the temporary, and for how long it's going to exist. Anything written to it, if you would be allowed to do so, would be lost.
Also think about default parameters, e.g. void foo(const bar &foobar = bar(42)). These are never allowed to be non-const references, as it would result in undefined behavior. That default object may live in a static scope (rather than every caller creating it anew), and someone messing with it would result in changing defaults. Good luck ever finding the cause for that bug.
Even for non-default parameters, const & allows the creation of the temporary at compile time (constexpr constructors), and also folding multiple instances of identical temporaries into a single instance in memory. This optimization would likewise not be possible without the guarantees made by const.
There is a plethora of other cases where const-correctness is also the key to enabling compiler optimizations. So it's generally better to use it correctly, even if you assume that your code discipline would have prevented at least undefined behavior.
But why do we need it if I, as the programmer, can ensure that I won't be changing the variable. Why do we need const then?
You could also ensure to not make any errors in coding, so why compiler errors/warnings, unit tests, issue trackers, …
Why would you use the keyword const if you already know variable should be constant?
Use const wherever possible in C++?
Adding const String& is like a surety that whatever the user is passing is not being tampered with, in the code, so can't we simply replace it with String? Because passing by value will instantiate a copy of the passed variable/parameter? Why do we use const String& then?
A class could be more expensive to copy than the indirection through the reference is, especially if the compiler is able to inline the function, in which case no indirection happens at all for a const &.
A copy could introduce unwanted side effects (could be problematic in an environment with limited resources)
A class could have a deleted copy constructor so no copy would be possible at all, but you still want to ensure const correctness.
The only use I know of String&(or int& or any other) is to directly pass the actual 'thing' into the function, not a copy of it, so whatever we do with that 'thing' will be reflected on the original, just like we use pointers to get the changes to be made to the actual 'thing' and not a copy of it. Why don't we use pointers instead of passing by reference? What advantage does it bring?
A pointer can accept a nullptr so you need to handle the case where nullptr is passed (if no nullptr must be passed gsl::not_null could be used). But using pointers would not allow passing temporaries.
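To illustrate the pointer versus reference point, here is a small sketch with hypothetical function names. The pointer version has to decide what a null argument means, while the reference versions cannot be called without an object.

```cpp
#include <cassert>
#include <string>

// Pointer parameter: "no object" is a legal argument, so it must be checked.
inline bool clear_if_present(std::string* s) {
    if (s == nullptr) return false;
    s->clear();
    return true;
}

// Reference parameter: the caller must supply an object; no null check.
inline void clear_it(std::string& s) { s.clear(); }

// const reference: accepts named objects and temporaries alike.
inline bool is_blank(const std::string& s) { return s.empty(); }

inline void pointer_vs_reference_usage() {
    std::string a = "x";
    assert(clear_if_present(&a) && a.empty());
    assert(!clear_if_present(nullptr));   // the null case must be handled
    std::string b = "y";
    clear_it(b);
    assert(b.empty());
    assert(is_blank(std::string()));      // temporary is fine for const&
    // clear_if_present(&std::string());  // would not compile: cannot take
    //                                    // the address of a temporary
}
```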

How string accepting interface should look like?

This is a follow up of this question. Suppose I write a C++ interface that accepts or returns a const string. I can use a const char* zero-terminated string:
void f(const char* str); // (1)
The other way would be to use an std::string:
void f(const string& str); // (2)
It's also possible to write an overload and accept both:
void f(const char* str); // (3)
void f(const string& str);
Or even a template in conjunction with boost string algorithms:
template<class Range> void f(const Range& str); // (4)
My thoughts are:
(1) is not C++ish and may be less efficient when subsequent operations may need to know the string length.
(2) is bad because now f("long very long C string"); invokes a construction of std::string which involves a heap allocation. If f uses that string just to pass it to some low-level interface that expects a C-string (like fopen) then it is just a waste of resources.
(3) causes code duplication. Although one f can call the other depending on what is the most efficient implementation. However we can't overload based on return type, like in case of std::exception::what() that returns a const char*.
(4) doesn't work with separate compilation and may cause even larger code bloat.
Choosing between (1) and (2) based on what's needed by the implementation is, well, leaking an implementation detail to the interface.
The question is: what is the preferred way? Is there any single guideline I can follow? What's your experience?
Edit: There is also a fifth option:
void f(boost::iterator_range<const char*> str); // (5)
which has the pros of (1) (doesn't need to construct a string object) and (2) (the size of the string is explicitly passed to the function).
If you are dealing with a pure C++ code base, then I would go with #2, and not worry about callers of the function that don't use it with a std::string until a problem arises. As always, don't worry too much about optimization unless there is a problem. Make your code clean, easy to read, and easy to extend.
There is a single guideline you can follow: use (2) unless you have very good reasons not to.
A const char* str as a parameter does not make explicit what operations are allowed to be performed on str. How often can it be incremented before it segfaults? Is it a pointer to a char, an array of chars or a C string (i.e. a zero-terminated array of char)?
I don't really have a single hard preference. Depending on circumstances, I alternate between most of your examples.
Another option I sometimes use is similar to your Range example, but using plain old iterator ranges:
template <typename Iter>
void f(Iter first, Iter last);
which has the nice property that it works easily with both C-style strings (and allows the callee to determine the length of the string in constant time) as well as std::string.
If templates are problematic (perhaps because I don't want the function to be defined in a header), I sometimes do the same, but using char* as iterators:
void f(const char* first, const char* last);
Again, it can be trivially used with both C-strings and C++ std::string (as I recall, C++03 doesn't explicitly require strings to be contiguous, but every implementation I know of uses contiguously allocated strings, and I believe C++0x will explicitly require it).
So these versions both allow me to convey more information than the plain C-style const char* parameter (which loses information about the string length, and doesn't handle embedded nulls), in addition to supporting both of the major string types (and probably any other string class you can think of) in an idiomatic way.
The downside is of course that you end up with an additional parameter.
Unfortunately, string handling isn't really C++'s strongest side, so I don't think there is a single "best" approach. But the iterator pair is one of several approaches I tend to use.
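A short sketch of the non-template iterator-pair variant from this answer. The function name count_spaces is hypothetical; the example is meant to show that one [first, last) function serves both C strings and std::string, and that embedded null characters are handled because the length is explicit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Non-template version taking a [first, last) character range.
// The callee gets the length in constant time as last - first.
inline std::size_t count_spaces(const char* first, const char* last) {
    std::size_t n = 0;
    for (; first != last; ++first) {
        if (*first == ' ') ++n;
    }
    return n;
}

inline void count_spaces_usage() {
    // C string: the caller computes the end once with strlen.
    const char* c = "a b c";
    assert(count_spaces(c, c + std::strlen(c)) == 2);

    // std::string with an embedded '\0': the range covers all 5 characters.
    const std::string s("a \0 b", 5);
    assert(count_spaces(s.data(), s.data() + s.size()) == 2);
}
```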
For taking a parameter I would go with whatever is simplest and often that is const char*. This works with string literals with zero cost and retrieving a const char* from something stored in a std::string is typically very low cost as well.
Personally, I wouldn't bother with the overload. In all but the simplest cases you will want to merge the two code paths and have one call the other at some point or both call a common function. It could be argued that having the overload hides whether one is converted to the other or not and which path has a higher cost.
Only if I actually wanted to use const features of the std::string interface inside the function would I have const std::string& in the interface itself and I'm not sure that just using size() would be enough of a justification.
In many projects, for better or worse, alternative string classes are often used. Many of these, like std::string, give cheap access to a zero-terminated const char*; converting to a std::string requires a copy. Requiring a const std::string& in the interface is dictating a storage strategy even when the internals of the function don't need to specify this. I consider this to be undesirable, much like taking a const shared_ptr<X>& dictates a storage strategy whereas taking X&, if possible, allows the caller to use any storage strategy for a passed object.
The disadvantages of a const char* are that, purely from an interface standpoint, it doesn't enforce non-nullness (although very occasionally the difference between a null parameter and an empty string is used in some interfaces - this can't be done with std::string), and a const char* might be the address of just a single character. In practice, though, the use of a const char* to pass a string is so prevalent that I would consider citing this as a negative to be a fairly trivial concern. Other concerns, such as whether the encoding of the characters is specified in the interface documentation (applies to both std::string and const char*), are much more important and likely to cause more work.
The answer should depend heavily on what you are intending to do in f. If you need to do some complex processing with the string, approach 2 makes sense. If you simply need to pass it on to some other functions, then select based on those other functions (let's say for argument's sake you are opening a file - what would make most sense? ;) )
It's also possible to write an overload and accept both:
void f(const string& str) already accepts both because of the implicit conversion from const char* to std::string. So #3 has little advantage over #2.
I would choose void f(const string& str) if the function body does not do character-level analysis, that is, if it does not access the underlying char* of str.
Use (2).
The first stated problem with it is not an issue, because the string has to be created at some point regardless.
Fretting over the second point smells of premature optimization. Unless you have a specific circumstance where the heap allocation is problematic, such as repeated invocations with string literals, and those cannot be changed, then it is better to favor clarity over avoiding this pitfall. Then and only then might you consider option (3).
(2) clearly communicates what the function accepts, and has the right restrictions.
Of course, all 5 are improvements over foo(char*) which I have encountered more than I would care to mention.

How often do you declare your functions to be const?

Do you find it helpful?
Every time you know that a method won't change the state of the object, you should declare it to be const.
It helps when reading your code. And it helps when you try to change the state of the object - the compiler will stop you.
As often as possible. Functions that don't need to change data members should be declared as const. This makes code more understandable and may give a hint to the compiler for optimization.
When you have a const object, the only methods that the compiler will let you call are those marked safe by the const keyword. In fact, only member methods make sense as const methods.
In C++, every method of an object receives an implicit this pointer to the object; A const method will simply receive a const this pointer.
Assuming you're talking about methods, as in:
struct Example {
void f() const;
};
Then if they should be callable on a const object, the method should be const.
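A minimal sketch of the rule above, using a hypothetical Counter type: through a const reference only the const member function can be called.

```cpp
#include <cassert>

// Hypothetical type for illustration.
struct Counter {
    int value = 0;
    int get() const { return value; }  // does not modify the object
    void increment() { ++value; }      // modifies the object
};

inline int read_only(const Counter& c) {
    // c.increment();  // would not compile: non-const member function
    //                 // called on a const object
    return c.get();    // fine: get() is declared const
}

inline void counter_usage() {
    Counter c;
    c.increment();     // allowed: c itself is not const
    assert(read_only(c) == 1);
}
```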
Not often enough....
While all the answers are correct, if you are using a library that is not const correct then it is difficult to use const in all the places you should use it.
If you have an old API that takes a char * that for all logical purposes should be a const char *, then you either have to forget const in your code or do some ugly casting. In that case I forget const.
I use const at almost every opportunity, and like the fact it provides both documentation of intent and enforces compliance with that intent. Language features don't get much better than that, and yet const is curiously unloved. (The reality seems to be that the majority of self-proclaimed C++ coders can't explain the difference between int*, int*const, const int* and const int*const.)
While it could never have happened due to its 'C' origins, I often think C++ would be a better language if const had been the default and a liberal sprinkling of (say) 'var' or some similar keyword was necessary to allow post-construction modification of variables.
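For readers unsure about the four pointer declarations mentioned in this answer, the following sketch shows what each one permits. The function name is hypothetical, and the commented-out lines are the ones a compiler would reject.

```cpp
#include <cassert>

inline int pointer_const_demo() {
    int a = 1, b = 2;

    int* p = &a;              // pointer and pointee are both modifiable
    *p = 10;
    p = &b;

    int* const q = &a;        // constant pointer to a modifiable int
    *q = 20;                  // ok: the int may change
    // q = &b;                // would not compile: the pointer may not

    const int* r = &a;        // modifiable pointer to a const int
    r = &b;                   // ok: the pointer may be reseated
    // *r = 3;                // would not compile: the int may not change

    const int* const s = &a;  // neither may change through s
    // *s = 4;  s = &b;       // neither line would compile

    assert(a == 20 && b == 2);
    return *p + *q + *r + *s;
}
```

Reading each declaration right to left ("q is a const pointer to int", "r is a pointer to const int") is a common way to keep them apart.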
I used to declare functions as const but now I rarely if ever do it anymore.
The main problem was that if I wanted to change a function from const to non-const, it would mean that all other const functions calling that function would also need to be changed to non-const.
That happened more often than I thought due to optimization. For example I had a GetData() function which used to return a pointer to the data, but I later optimized to only set up the data if GetData() ends up being called (which changes the object's state, so it's no longer a const function).
Same thing for other functions that could do some calculation without changing the object's state, but at some point it made more sense caching the result since the function was called many times and was a bottleneck.
Also in practice, at least for my project, I saw very little benefit from declaring my functions as const.

operator char* in STL string class

Why doesn't the STL string class have an overloaded char* operator built-in? Is there any specific reason for them to avoid it?
If there was one, then using the string class with C functions would become much more convenient.
I would like to know your views.
Following is the quote from Josuttis STL book:
However, there is no automatic type conversion from a string object to a C-string. This is for safety reasons to prevent unintended type conversions that result in strange behavior (type char* often has strange behavior) and ambiguities (for example, in an expression that combines a string and a C-string it would be possible to convert string into char* and vice versa). Instead, there are several ways to create or write/copy in a C-string. In particular, c_str() is provided to generate the value of a string as a C-string (as a character array that has '\0' as its last character).
You should always avoid cast operators, as they tend to introduce ambiguities into your code that can only be resolved with the use of further casts, or worse still compile but don't do what you expect. A char*() operator would have lots of problems. For example:
string s = "hello";
strcpy( s, "some more text" );
would compile without a warning, but clobber the string.
A const version would be possible, but as strings must (possibly) be copied in order to implement it, it would have an undesirable hidden cost. The explicit c_str() function means you must always state that you really intend to use a const char *.
The string template specification deliberately allows for a "disconnected" representation of strings, where the entire string contents is made up of multiple chunks. Such a representation doesn't allow for easy conversion to char*.
However, the string template also provides the c_str method for precisely the purpose you want: what's wrong with using that method?
Around 1998-2002 this was a hot topic on C++ forums. The main problem is the zero terminator: the specification of std::string allows a zero character as ordinary content, but a char* string does not.
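The zero-terminator mismatch can be seen in a few lines. This is only an illustrative sketch with a hypothetical helper name: the std::string keeps all seven characters, while a C function looking at c_str() stops at the first '\0'.

```cpp
#include <cassert>
#include <cstring>
#include <string>

inline std::string with_embedded_zero() {
    // The (pointer, length) constructor keeps '\0' as an ordinary character.
    return std::string("abc\0def", 7);
}

inline void embedded_zero_usage() {
    const std::string s = with_embedded_zero();
    assert(s.size() == 7);                 // std::string sees everything
    assert(std::strlen(s.c_str()) == 3);   // C code sees only "abc"
}
```

An implicit conversion to char* would make this truncation silent at every call into a C API, whereas an explicit c_str() call at least marks the spot.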
You can use c_str instead:
string s("I like rice!");
const char* cstr = s.c_str();
I believe that in most cases you don't need the char*, and can work more conveniently with the string class itself.
If you need interop with C-style functions, using a std::vector<char> / <wchar_t> is often easier.
It's not as convenient, and unfortunately you can't O(1)-swap it with a std::string (now that would be a nice thing).
In that respect, I much prefer the interface of MFC/ATL CString which has stricter performance guarantees, provides interop, and doesn't treat wide character/unicode strings as totally foreign (but ok, the latter is somewhat platform specific).