Convert string to all uppercase letters with std::transform - c++

I'm using the transform algorithm and std::toupper to achieve this, but can it be done in one line, like this?
transform(s.begin(), s.end(), ostream_iterator<string>(cout, "\n"),std::toupper);
I get an error on this. Do I have to write a unary function and call it with transform, or can I use some adaptors?

Use ostream_iterator<char> instead of ostream_iterator<string>:
transform(s.begin(),s.end(),ostream_iterator<char>(cout,"\n"),std::toupper);
std::transform transforms each character and passes it to the output iterator. That is why the type argument of the output iterator should be char instead of std::string.
By the way, each character will be printed on a newline. Is that what you want? If not, don't pass "\n".
--
Note: You may have to use ::toupper instead of std::toupper.
See these
http://www.ideone.com/x6FB5 (each character on a newline)
http://www.ideone.com/RcEKn (all characters on the same line)

First, if you want to output chars (and all of the chars), you'll
need to use ostreambuf_iterator<char>, and not
ostream_iterator<string>. And ostreambuf_iterator<char> expresses
better what you want than ostream_iterator<char>; you're
outputting chars directly, not formatting anything.
(ostream_iterator uses the << operator, which formats.)
Second, be aware that there is not always a one to one translation of
lower to upper (e.g. 'ß' maps to the two character sequence "SS" in
upper case), so std::transform can't really be used to do the job
correctly. (And of course, it doesn't handle multibyte encodings like
UTF-8 correctly.) For all but the simplest uses, you need something
more complicated. But even for the simplest cases:
std::toupper is overloaded: one of the overloads is a template, which
takes two arguments, and the other is a function which takes a single
int; neither will work directly here, and the fact that transform is
also a template means that overload resolution and template type
deduction won't work even if they did. So basically, you have to add
something. It's possible to use the 2 argument template function if you
add enough qualifiers and use boost::bind or something similar to bind
the second argument, but it's almost as much text as writing a simple
toupper functional object yourself. And you can't use the single
argument form (which can be unambiguously accessed if you include
<ctype.h> and use ::toupper) because it has undefined behavior if
you use a char as the argument when you call it: you have to convert
the char to unsigned char first (unless, of course, plain char is
unsigned in the implementation you are using—and in all
implementations to which your code will ever be ported).

Related

How do I compare base36 values in C++?

As we all know, with C-style string comparisons, the value is dependent on the ASCII value of each character and just uses the strcmp function to compare. What I'm unsure about is what std::string comparison depends on.
Although I have searched Google, I still didn't find the answer.
In addition, if strings are all base36 strings and they are all in lower case, could I compare their values by strings directly? Or I should convert them as a long variable using the strtol function? Which method is better?
Your opening premise, "As we all know, with C-style string comparisons, the value is dependent on the ASCII value of each character...", is unfortunately already wrong. With e.g. UTF-8 strings and various forms of collation, it is simply untrue.
Then "...and just uses the strcmp function to compare." is also wrong, because C-style strings don't have an inherent way to compare but multiple ways that also depend on e.g. the encoding and locale. You could use strcmp() for bytewise equality comparisons though, but that won't always give you expected results.
To answer your question about what std::string uses: std::string is a specialization of the std::basic_string template, and it delegates comparisons to its char_traits template parameter, which typically uses memcmp(). It cannot use strcmp() because, unlike a C-style string, a std::string can contain null characters, and strcmp() would stop at the first one.
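
A small sketch of the embedded-null point, using two hypothetical helpers (make_with_null and strcmp_equal): two strings that differ only after a null character are different to std::string but look equal to strcmp.

```cpp
#include <cstring>
#include <string>

// Builds the 4-character string 'a', 'b', '\0', last.
std::string make_with_null(char last) {
    std::string s("ab");
    s.push_back('\0');
    s.push_back(last);
    return s;
}

// Compares the two strings the C way; strcmp stops at the first null.
bool strcmp_equal(const std::string& a, const std::string& b) {
    return std::strcmp(a.c_str(), b.c_str()) == 0;
}
```

With a = make_with_null('c') and b = make_with_null('d'), a == b is false while strcmp_equal(a, b) is true.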
std::string compare depends upon 'ASCII values' in exactly the same way that strcmp does.
For base36 comparisons, simple string comparison (either strcmp or std::string) doesn't work because "00123" and "123" are equal when representing base36 integers but they compare differently as strings. Neither does strtol work very well because of integer overflow. Instead you should probably write your own comparison routine that removes leading zeros, then compares length and finally for strings of equal length does a string comparison.
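
The routine described above could be sketched roughly as follows. The name compare_base36 is hypothetical, and the sketch assumes non-negative values written with the digits 0-9 and lower-case a-z in an ASCII-compatible encoding, where the digits sort before the letters.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: compares two base36 numbers given as strings.
// Returns a negative value, zero, or a positive value, like strcmp.
int compare_base36(const std::string& a, const std::string& b) {
    // Remove leading zeros; an all-zero string becomes empty.
    std::size_t ia = a.find_first_not_of('0');
    std::size_t ib = b.find_first_not_of('0');
    std::string sa = (ia == std::string::npos) ? std::string() : a.substr(ia);
    std::string sb = (ib == std::string::npos) ? std::string() : b.substr(ib);

    // After stripping, more digits means a larger value.
    if (sa.size() != sb.size()) {
        return sa.size() < sb.size() ? -1 : 1;
    }
    // Equal length: character order matches digit order under the
    // encoding assumption stated above.
    return sa.compare(sb);
}
```

Because it never converts to an integer type, this works for strings of any length, which sidesteps the strtol overflow problem.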

Is it better to use std::string or single char when possible?

In my class I want to store certain characters. I have a CsvReader
class, and I want to store a columnDelimiter character. I wonder,
is it better to have it as a char, or just use std::string?
In terms of usage I suppose std::string is far better, but I wonder
maybe there will be major performance differences?
If your delimiter is constrained to be a single character, use a char.
If your delimiter may be a string, use a std::string.
Seems fairly self-explanatory. Refer to the requirements of the project, and the constraints of the feature that follow from those requirements.
Personally it seems to me that a CSV field delimiter will always be a single character, in which case std::string is not only misleading, but pointlessly heavy.
"In terms of usage I suppose std::string is far better"
I have largely ignored this claim as you did not provide any rationale, but let me just say that I reject the hypothetical premise of the claim.
"I wonder maybe there will be major performance differences?"
Absolutely! A string generally manages a dynamically-allocated block of characters; this is far heavier than a single byte in memory. Notwithstanding the small-string optimisation that your implementation may perform, it's simply pointless to add all this weight when all you wish to represent is a single character. A single character is a char, so use a char in such a case.
A character is a character. A string is a string; conceptually, a sequence of N characters, where N is any natural number.
If your design requires a character, use char. If it requires a string, use string.
In both cases you may have multilanguage issues (what happens if the character is 青? what happens if the string is 青い?), but these are totally independent of your choice of whether you need a character or a sequence of N characters, i.e. a string.
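
For illustration only, here is a rough sketch of how a CsvReader might hold its delimiter as a char. The member names and the splitLine function are assumptions, and quoting rules are ignored entirely.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch of the CsvReader from the question; only the
// delimiter handling is shown.
class CsvReader {
public:
    explicit CsvReader(char columnDelimiter = ',')
        : columnDelimiter_(columnDelimiter) {}

    // Splits one line into fields at each occurrence of the delimiter.
    std::vector<std::string> splitLine(const std::string& line) const {
        std::vector<std::string> fields;
        std::string::size_type start = 0;
        for (;;) {
            std::string::size_type pos = line.find(columnDelimiter_, start);
            if (pos == std::string::npos) {
                fields.push_back(line.substr(start));  // last field
                return fields;
            }
            fields.push_back(line.substr(start, pos - start));
            start = pos + 1;  // skip the one-character delimiter
        }
    }

private:
    char columnDelimiter_;  // a single character, so a char is enough
};
```

The search and the "skip past the delimiter" step both stay trivial because the delimiter is known to be exactly one character.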

c++ best way to call function with const char* parameter type

What is the best way to call a function with the following declaration?
string Extract(const char* pattern,const char* input);
I use:
string str=Extract("something","input text");
Is there a problem with this usage?
Should I use the following instead?
char pattern[]="something";
char input[]="input";
//or use pointers with new operator and copy then free?
Both work, and I prefer the first one, but I want to know the best practice.
A literal string (e.g. "something") works just fine as a const char* argument to a function call.
The first method, i.e. passing them literally in, is usually preferable.
There are occasions, though, where you don't want your strings hard-coded into the text. In a way they are like magic numbers: magic words or phrases. In those cases you would prefer to store the values in constant identifiers and pass those in instead.
This would happen often when:
1. a word has a special meaning, and is passed in many times in the code to have that meaning.
or
2. the word may be cryptic in some way and a constant identifier may be more descriptive
Unless you plan to have duplicates of the same strings, or to alter those strings, I'm a fan of the first way (passing the literals directly). It means less jumping around the code to find what the parameters actually are, and it also means less work in passing parameters.
Seeing as this is tagged for C++, passing the literals directly allows you to easily switch the function parameters to std::string with little effort.
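
The options discussed above can be put side by side in a short sketch. The body of Extract here is a stand-in, since the real function's behaviour is not given in the question; kKeyPrefix and the three wrapper functions are hypothetical names as well.

```cpp
#include <cstring>
#include <string>

// Stand-in for the Extract from the question. Its real behaviour is
// unknown, so this version returns whatever follows the first match of
// pattern in input, or an empty string if there is no match.
std::string Extract(const char* pattern, const char* input) {
    const char* hit = std::strstr(input, pattern);
    return hit ? std::string(hit + std::strlen(pattern)) : std::string();
}

// A named constant for a pattern that is reused or cryptic.
const char* const kKeyPrefix = "key=";

// Passing literals directly.
std::string extract_with_literals() { return Extract("key=", "key=value"); }

// Passing a named constant instead of a repeated literal.
std::string extract_with_constant() { return Extract(kKeyPrefix, "key=value"); }

// Callers that already hold a std::string can pass c_str().
std::string extract_from_string(const std::string& input) {
    return Extract(kKeyPrefix, input.c_str());
}
```

None of these needs a writable char array or a heap copy; the function only reads its arguments.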

boost split with a single character or just one string

I wish to split a string on a single character or a string. I would like to use boost::split since boost string is our standard for basic string handling (I don't wish to mix several techniques).
In the single character case I could do split(vec,str,is_any_of(":")) but I'd like to know if there is a way to specify just a single character. It may improve performance, but more importantly I think the code would be clearer with just a single character, since is_any_of conveys a different meaning than what I want.
For matching against a string I don't know what syntax to use. I don't wish to construct a regex; some simple syntax like split(vec,str,match_str("::")) would be good.
I was looking for the same answer but I couldn't find one. Finally I managed to produce one on my own.
You can use std::equal_to to form the predicate you need. Here's an example:
boost::split(container, str, std::bind1st(std::equal_to<char>(), ','));
This is exactly how I do it when I need to split a string using a single character.
In the following code, let me assume using namespace boost for brevity.
As for splitting on a character, if only algorithm/string is allowed,
is_from_range might serve the purpose:
split(vec,str, is_from_range(':',':'));
Alternatively, if lambda is allowed:
split(vec,str, lambda::_1 == ':');
or if preparing a dedicated predicate is allowed:
struct match_char {
char c;
match_char(char c) : c(c) {}
bool operator()(char x) const { return x == c; }
};
split(vec,str, match_char(':'));
As for matching against a string, as David Rodríguez mentioned,
there seems to be no way to do it with split.
If iter_split is allowed, probably the following code will meet the purpose:
iter_split(vec,str, first_finder("::"));
On the simple token, I would just leave is_any_of as it is quite easy to understand what is_any_of( single_option ) means. If you really feel like changing it, the third element is a functor, so you could pass an equals functor to the split function.
That approach will not really work with multi-character tokens, as the iteration is meant to be character by character. I don't know the library well enough to offer prebuilt alternatives, but you can implement the functionality on top of split_iterators.

Overloading a method on default arguments

Is it possible to overload a method on default parameters?
For example, if I have a method split() to split a string, but the string has two delimiters, say '_' and "delimit". Can I have two methods something like:
split(const char *str, char delim = ' ')
and
split(const char *str, const char* delim = "delimit");
Or, is there a better way of achieving this? Somehow, my brain isn't working now and am unable to think of any other solution.
Edit: The problem in detail:
I have a string with two delimiters, say for example, nativeProbableCause_Complete|Alarm|Text. I need to separate nativeProbableCause and Complete|Alarm|Text and then further, I need to separate Complete|Alarm|Text into individual words and join them back with space as a separator sometime later (for which I already have written a utility and isn't a big deal). It is only the separation of the delimited string that is troubling me.
No, you can't - if you think about it, the notion of a default means 'use this unless I say otherwise'. If the compiler has 2 options for a default, which one will it choose?
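
To make the ambiguity concrete: the two declarations from the question can coexist as overloads, but a call that omits the delimiter, such as split(str), cannot choose between them. A minimal sketch of one way around it keeps a default on only one overload. The helper split_impl and the treatment of an empty delimiter are assumptions made for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical shared helper: splits s at every occurrence of delim.
// An empty delimiter is treated as "no split" (an assumption of this sketch).
static std::vector<std::string> split_impl(const std::string& s,
                                           const std::string& delim) {
    std::vector<std::string> parts;
    if (delim.empty()) {
        parts.push_back(s);
        return parts;
    }
    std::string::size_type start = 0;
    for (;;) {
        std::string::size_type pos = s.find(delim, start);
        if (pos == std::string::npos) {
            parts.push_back(s.substr(start));  // remainder after last match
            return parts;
        }
        parts.push_back(s.substr(start, pos - start));
        start = pos + delim.size();
    }
}

// Single-character delimiter; this overload keeps the default.
std::vector<std::string> split(const char* str, char delim = ' ') {
    return split_impl(str, std::string(1, delim));
}

// String delimiter; no default here, so split(str) is not ambiguous.
std::vector<std::string> split(const char* str, const char* delim) {
    return split_impl(str, delim);
}
```

With this arrangement split("a b c") uses the space default, while split(s, '_') and split(s, "delimit") select an overload by the type of the second argument.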
How about implementing as 2 different methods like
split_with_default_delimiter_space
split_with_default_delimiter_delimit
Personally I'd prefer something like this (more readable, and it conveys intent) over the kind of overloading that you mentioned, even if it were somehow possible for the compiler to do that.
Why not just call split() twice and explicitly pass the delimiter the second time? Will delimiters always be single characters?
Do you perform any other processing on the 2nd set of words before joining them? If not, then for the second task what you really want to do is replace substrings. This is most easily done with std::string::find and std::string::replace. If you must use c-strings, you could use strstr/strchr/strpbrk, strcpy and strcat, or use just strstr/strchr/strpbrk and join them in place.
You could use a version of split that accepts a variable number of delimiters (split(const char*, vector<string>) or, if you prefer, split(const char*, const char**)), or just use Boost Tokenizer.
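
For the concrete input in the question, a sketch along the lines of the find/replace suggestion might look like the following. The name split_alarm is hypothetical, and the code assumes the first '_' is the only separator of interest and that the words need no processing other than swapping '|' for a space. Since both separators are single characters, std::replace from <algorithm> is used here rather than std::string::replace.

```cpp
#include <algorithm>
#include <string>
#include <utility>

// Hypothetical helper: splits "head_tail" at the first '_' and turns every
// '|' in the tail into a space. If there is no '_', the tail is empty.
std::pair<std::string, std::string> split_alarm(const std::string& s) {
    std::string::size_type us = s.find('_');
    if (us == std::string::npos) {
        return std::make_pair(s, std::string());
    }
    std::string head = s.substr(0, us);
    std::string tail = s.substr(us + 1);
    std::replace(tail.begin(), tail.end(), '|', ' ');
    return std::make_pair(head, tail);
}
```

This avoids splitting the tail into words and joining them again when no per-word processing is needed.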