C++ Trie search performance - c++

So I made a trie that holds quite a large amount of data, my search algorithm is quite fast but I wanted to see if anyone had any insight as to how I could make it any faster.
bool search (string word)
{
int wordLength = word.length();
node *current = head;
for (unsigned int i=0; i<wordLength; ++i)
{
if (current->child[((int)word[i]+(int)'a')] == NULL)
return false;
else
current = current->child[((int)word[i]+(int)'a')];
}
return current->is_end;
}

Looks good performance-wise, except these tidbits:
Declare the function parameter as const string& (instead of just string), to avoid unnecessary copying.
You could extract the common subexpression current->child[((int)word[i]+(int)'a')] in front of the if, to avoid repetition and make the code slightly smaller, but any compiler worth its salt will do that optimization for you anyway.
"Style" suggestions:
What if word contains character below 'a' (such as capital letter, digit, punctuation mark, new line etc...)? You'll need to validate input to avoid accessing the wrong memory location and crashing. Also shouldn't this be -(int)'a' instead of + (I'm assuming you just want to support a limited set of characters: 'a' and above)?
Declare wordLength as size_t (or better yet auto), but this is not important for strings of any practical length (might even hurt the performance slightly if size_t is greater than int). Ditto for i.

bool search (string word)
Calling this function, string word will be copied,
type of function below will be faster.
bool search (const string &word)
or
bool search (const char *word)

Related

C++ getting length of char array using a second function

I'm trying to get the length of a character array in a second function. I've looked at a few questions on here (1 2) but they don't answer my particular question (although I'm sure something does, I just can't find it). My code is below, but I get the error "invalid conversion from 'char' to 'const char*'". I don't know how to convert my array to what is needed.
#include <cstring>
#include <iostream>
int ValidInput(char, char);
int main() {
char user_input; // user input character
char character_array[26];
int valid_guess;
valid_guess = ValidGuess(user_input, character_array);
// another function to do stuff with valid_guess output
return 0;
}
int ValidGuess (char user_guess, char previous_guesses) {
for (int index = 0; index < strlen(previous_guesses); index++) {
if (user_guess == previous_guesses[index]) {
return 0; // invalid guess
}
}
return 1; // valid guess, reaches this if for loop is complete
}
Based on what I've done so far, I feel like I'm going to have a problem with previous_guesses[index] as well.
char user_input;
defines a single character
char character_array[26];
defines an array of 26 characters.
valid_guess = ValidGuess(user_input, character_array);
calls the function
int ValidGuess (char user_guess, char previous_guesses)
where char user_guess accepts a single character, lining up correctly with the user_input argument, and char previous_guesses accepts a single character, not the 26 characters of character_array. previous_guesses needs a different type to accommodate character_array. This be the cause of the reported error.
Where this gets tricky is character_array will decay to a pointer, so
int ValidGuess (char user_guess, char previous_guesses)
could be changed to
int ValidGuess (char user_guess, char * previous_guesses)
or
int ValidGuess (char user_guess, char previous_guesses[])
both ultimately mean the same thing.
Now for where things get REALLY tricky. When an array decays to a pointer it loses how big it is. The asker has gotten around this problem, kudos, with strlen which computes the length, but this needs a bit of extra help. strlen zips through an array, counting until it finds a null terminator, and there are no signs of character_array being null terminated. This is bad. Without knowing where to stop strlen will probably keep going1. A quick solution to this is go back up to the definition of character_array and change it to
char character_array[26] = {};
to force all of the slots in the array to 0, which just happens to be the null character.
That gets the program back on its feet, but it could be better. Every call to strlen may recount (compilers are smart and could compute once per loop and store the value if it can prove the contents won't change) the characters in the string, but this is still at least one scan through every entry in character_array to see if it's null when what you really want to do is scan for user_input. Basically the program looks at every item in the array twice.
Instead, look for both the null terminator and user_input in the same loop.
int index = 0;
while (previous_guesses[index] != '\0' ) {
if (user_guess == previous_guesses[index]) {
return 0; // prefer returning false here. The intent is clearer
}
index++;
}
You can also wow your friends by using pointers and eliminating the need for the index variable.
while (*previous_guesses != '\0' ) {
if (user_guess == *previous_guesses) {
return false;
}
previous_guesses++;
}
The compiler knows and uses this trick too, so use the one that's easier for you to understand.
For 26 entries it probably doesn't matter, but if you really want to get fancy, or have a lot more than 26 possibilities, use a std::set or a std::unordered_set. They allow only one of an item and have much faster look-up than scanning a list one by one, so long as the list is large enough to get over the added complexity of a set and take advantage of its smarter logic. ValidGuess is replaced with something like
if (used.find(user_input) != used.end())
Side note: Don't forget to make the user read a value into user_input before the program uses it. I've also left out how to store the previous inputs because the question does as well.
1 I say probably because the Standard doesn't say what to do. This is called Undefined Behaviour. C++ is littered with the stuff. Undefined Behaviour can do anything -- work, not work, visibly not work, look like it works until it doesn't, melt your computer, anything -- but what it usually does is the easiest and fastest thing. In this case that's just keep going until the program crashes or finds a null.

How to create a function that removes all of a selected character in a C-string?

I want to make a function that removes all the characters of ch in a c-string.
But I keep getting an access violation error.
Unhandled exception at 0x000f17ba in testassignments.exe: 0xC0000005: Access violation writing location 0x000f787e.
void removeAll(char* &s, const char ch)
{
int len=strlen(s);
int i,j;
for(i = 0; i < len; i++)
{
if(s[i] == ch)
{
for(j = i; j < len; j++)
{
s[j] = s[j + 1];
}
len--;
i--;
}
}
return;
}
I expected the c-string to not contain the character "ch", but instead, I get an access violation error.
In the debug I got the error on the line:
s[j] = s[j + 1];
I tried to modify the function but I keep getting this error.
Edit--
Sample inputs:
s="abmas$sachus#settes";
ch='e' Output->abmas$sachus#settes, becomes abmas$sachus#stts
ch='t' Output-> abmas$sachus#stts, becomes abmas$sachus#ss.
Instead of producing those outputs, I get the access violation error.
Edit 2:
If its any help, I am using Microsoft Visual C++ 2010 Express.
Apart from the inefficiency of your function shifting the entire remainder of the string whenever encountering a single character to remove, there's actually not much wrong with it.
In the comments, people have assumed that you are reading off the end of the string with s[j+1], but that is untrue. They are forgetting that s[len] is completely valid because that is the string's null-terminator character.
So I'm using my crystal ball now, and I believe that the error is because you're actually running this on a string literal.
// This is NOT okay!
char* str = "abmas$sachus#settes";
removeAll(str, 'e');
This code above is (sort of) not legal. The string literal "abmas$sachus#settes" should not be stored as a non-const char*. But for backward compatibility with C where this is allowed (provided you don't attempt to modify the string) this is generally issued as a compiler warning instead of an error.
However, you are really not allowed to modify the string. And your program is crashing the moment you try.
If you were to use the correct approach with a char array (which you can modify), then you have a different problem:
// This will result in a compiler error
char str[] = "abmas$sachus#settes";
removeAll(str, 'e');
Results in
error: invalid initialization of non-const reference of type ‘char*&’ from an rvalue of type ‘char*’
So why is that? Well, your function takes a char*& type that forces the caller to use pointers. It's making a contract that states "I can modify your pointer if I want to", even if it never does.
There are two ways you can fix that error:
The TERRIBLE PLEASE DON'T DO THIS way:
// This compiles and works but it's not cool!
char str[] = "abmas$sachus#settes";
char *pstr = str;
removeAll(pstr, 'e');
The reason I say this is bad is because it sets a dangerous precedent. If the function actually did modify the pointer in a future "optimization", then you might break some code without realizing it.
Imagine that you want to output the string with characters removed later, but the first character was removed and you function decided to modify the pointer to start at the second character instead. Now if you output str, you'll get a different result from using pstr.
And this example is only assuming that you're storing the string in an array. Imagine if you actually allocated a pointer like this:
char *str = new char[strlen("abmas$sachus#settes") + 1];
strcpy(str, "abmas$sachus#settes");
removeAll(str, 'e');
Then if removeAll changes the pointer, you're going to have a BAD time when you later clean up this memory with:
delete[] str; //<-- BOOM!!!
The I ACKNOWLEDGE MY FUNCTION DEFINITION IS BROKEN way:
Real simply, your function definition should take a pointer, not a pointer reference:
void removeAll(char* s, const char ch)
This means you can call it on any modifiable block of memory, including an array. And you can be comforted by the fact that the caller's pointer will never be modified.
Now, the following will work:
// This is now 100% legit!
char str[] = "abmas$sachus#settes";
removeAll(str, 'e');
Now that my free crystal-ball reading is complete, and your problem has gone away, let's address the elephant in the room:
Your code is needlessly inefficient!
You do not need to do the first pass over the string (with strlen) to calculate its length
The inner loop effectively gives your algorithm a worst-case time complexity of O(N^2).
The little tricks modifying len and, worse than that, the loop variable i make your code more complex to read.
What if you could avoid all of these undesirable things!? Well, you can!
Think about what you're doing when removing characters. Essentially, the moment you have removed one character, then you need to start shuffling future characters to the left. But you do not need to shuffle one at a time. If, after some more characters you encounter a second character to remove, then you simply shunt future characters further to the left.
What I'm trying to say is that each character only needs to move once at most.
There is already an answer demonstrating this using pointers, but it comes with no explanation and you are also a beginner, so let's use indices because you understand those.
The first thing to do is get rid of strlen. Remember, your string is null-terminated. All strlen does is search through characters until it finds the null byte (otherwise known as 0 or '\0')...
[Note that real implementations of strlen are super smart (i.e. much more efficient than searching single characters at a time)... but of course, no call to strlen is faster]
All you need is your loop to look for the NULL terminator, like this:
for(i = 0; s[i] != '\0'; i++)
Okay, and now to ditch the inner loop, you just need to know where to stick each new character. How about just keeping a variable new_size in which you are going to count up how long the final string is.
void removeAll(char* s, char ch)
{
int new_size = 0;
for(int i = 0; s[i] != '\0'; i++)
{
if(s[i] != ch)
{
s[new_size] = s[i];
new_size++;
}
}
// You must also null-terminate the string
s[new_size] = '\0';
}
If you look at this for a while, you may notice that it might do pointless "copies". That is, if i == new_size there is no point in copying characters. So, you can add that test if you want. I will say that it's likely to make little performance difference, and potentially reduce performance because of additional branching.
But I'll leave that as an exercise. And if you want to dream about really fast code and just how crazy it gets, then go and look at the source code for strlen in glibc. Prepare to have your mind blown.
You can make the logic simpler and more efficient by writing the function like this:
void removeAll(char * s, const char charToRemove)
{
const char * readPtr = s;
char * writePtr = s;
while (*readPtr) {
if (*readPtr != charToRemove) {
*writePtr++ = *readPtr;
}
readPtr++;
}
*writePtr = '\0';
}

How could I speed up comparison of std::string against string literals?

I have a bunch of code where objects of type std::string are compared for equality against string literals. Something like this:
//const std:string someString = //blahblahblah;
if( someString == "(" ) {
//do something
} else if( someString == ")" ) {
//do something else
} else if// this chain can be very long
The comparison time accumulates to a serious amount (yes, I profiled) and so it'd be nice to speed it up.
The code compares the string against numerous short string literals and this comparison can hardly be avoided. Leaving the string declared as std::string is most likely inevitable - there're thousands lines of code like that. Leaving string literals and comparison with == is also likely inevitable - rewriting the whole code would be a pain.
The problem is the STL implementation that comes with Visual C++11 uses somewhat strange approach. == is mapped onto std::operator==(const basic_string&, const char*) which calls basic_string::compare( const char* ) which in turn calls std::char_traits<char>( const char* ) which calls strlen() to compute the length of the string literal. Then the comparison runs for the two strings and lengths of both strings are passed into that comparison.
The compiler has a hard time analyzing all this and emits code that traverses the string literal twice. With short literals that's not much time but every comparison involves traversing the literal twice instead of once. Simply calling strcmp() would most likely be faster.
Is there anything I could do like perhaps writing a custom comparator class that would help avoid traversing the string literals twice in this scenario?
Similar to Dietmar's solution, but with slightly less editing: you can wrap the string (once) instead of each literal
#include <string>
#include <cstring>
struct FastLiteralWrapper {
std::string const &s;
explicit FastLiteralWrapper(std::string const &s_) : s(s_) {}
template <std::size_t ArrayLength>
bool operator== (char const (&other)[ArrayLength]) {
std::size_t const StringLength = ArrayLength - 1;
return StringLength == s.size()
&& std::memcmp(s.data(), other, StringLength) == 0;
}
};
and your code becomes:
const std:string someStdString = "blahblahblah";
// just for the context of the comparison:
FastLiteralWrapper someString(someStdString);
if( someString == "(" ) {
//do something
} else if( someString == ")" ) {
//do something else
} else if// this chain can be very long
NB. the fastest solution - at the cost of more editing - is probably to build a (perfect) hash or trie mapping string literals to enumerated constants, and then just switch on the looked-up value. Long if/else if chains usually smell bad IMO.
Well, aside from C++14's string_literal, you could easily code up a solution:
For comparison with a single character, use a character literal and:
bool operator==(const std::string& s, char c)
{
return s.size() == 1 && s[0] == c;
}
For comparison with a string literal, you can use something like this:
template<std::size_t N>
bool operator==(const std::string& s, char const (&literal)[N])
{
return s.size() == N && std::memcmp(s.data(), literal, N-1) == 0;
}
Disclaimer:
The first might even be superfluous,
Only do this if you measure an improvement over what you had.
If you have long chain of string literals to compare to there is likely some potential to deal with comparing prefixes to group common processing. Especially when comparing a known set of strings for equality with an input string, there is also the option to use a perfect hash and key the operations off an integer produced by those.
Since the use of a perfect hash will probably have the best performance but also requires major changes of the code layout, an alternative could be to determine the size of the string literals at compile time and use this size while comparing. For example:
class Literal {
char const* d_base;
std::size_t d_length;
public:
template <std::size_t Length>
Literal(char const (&base)[Length]): d_base(base), d_length(Length - 1) {}
bool operator== (std::string const& other) const {
return other.size() == this->d_length
&& !other.memcmp(this->d_base, other.c_str(), this->d_length);
}
bool operator!=(std::string const& other) const { return !(*this == other); }
};
bool operator== (std::string const& str, Literal const& literal) {
return literal == str;
}
bool operator!= (std::string const& str, Literal const& literal) {
return !(str == literal);
}
Obviously, this assumes that your literals don't embed null characters ('\0') other than the implicitly added terminating null character as the static length would otherwise be distorted. Using C++11 constexpr it would be possible to guard against that possibility but the code gets somewhat more complicated without any good reason. You'd then compare your strings using something like
if (someString == Literal("(")) {
...
}
else if (someString == Literal(")")) {
...
}
The fastest string comparison you can get is by interning the strings: Build a large hash table that contains all strings that are ever created. Ensure that whenever a string object is created, it is first looked up from the hash table, only creating a new object if no preexisting object is found. Naturally, this functionality should be encapsulated in your own string class.
Once you have done this, string comparison is equivalent to comparing their addresses.
This is actually quite an old technique first popularized with the LISP language.
The point, why this is faster, is that every string only has to be created once. If you are careful, you'll never generate the same string twice from the same input bytes, so string creation overhead is controlled by the amount of input data you work through. And hashing all your input data once is not a big deal.
The comparisons, on the other hand, tend to involve the same strings over and over again (like your comparing to literal strings) when you write some kind of a parser or interpreter. And these comparisons are reduced to a single machine instruction.
2 other ideas :
A) Build a FSA using a lexical analyser tool like flex, so the string is converted to an integer token value, depending what it matches.
B) Use length, to break up long elseif chains, possibly partly table driven
Why not get the length of the string something, at the top then just compare against the literals it could possibly match.
If there's a lot of them, it may be worth making it table driven and use a map and function pointers. You could just special case the single character literals, for example perhaps using a function lookup table.
Finding non-matches fast and the common lengths may suffice, and not require too much code restructuring, but be more maintainable as well as faster.
int len = strlen (something);
if ( ! knownliterallength[ len]) {
// not match
...
} else {
// First char may be used to index search, or literals are stored in map with *func()
switch (len)
{
case 1: // Could use a look table index by char and *func()
processchar( something[0]);
break;
case 2: // Short strings
case 3:
case 4:
processrunts( something);
break
default:
// First char used to index search, or literals are stored in map with *func()
processlong( something);
break
}
}
This is not the prettiest solution but it has proved quite fast when there is a lot of short strings to be compared (like operators and control characters/keywords in a script parser?).
Create a search tree based on string length and only compare characters. Try to represent known strings as an enumeration if this makes it cleaner in the particular implementation.
Short example:
enum StrE {
UNKNOWN = 0 ,
RIGHT_PAR ,
LEFT_PAR ,
NOT_EQUAL ,
EQUAL
};
StrE strCmp(std::string str)
{
size_t l = str.length();
switch(l)
{
case 1:
{
if(str[0] == ')') return RIGHT_PAR;
if(str[0] == '(') return LEFT_PAR;
// ...
break;
}
case 2:
{
if(str[0] == '!' && str[1] == '=') return NOT_EQUAL;
if(str[0] == '=' && str[1] == '=') return EQUAL;
// ...
break;
}
// ...
}
return UNKNOWN;
}
int main()
{
std::string input = "==";
switch(strCmp(input))
{
case RIGHT_PAR:
printf("right par");
break;
case LEFT_PAR:
printf("left par");
break;
case NOT_EQUAL:
printf("not equal");
break;
case EQUAL:
printf("equal");
break;
case UNKNOWN:
printf("unknown");
break;
}
}

What is the C++ convention when I need to add a useless return statement?

I was trying to write a function that returns the first non-repeated character in a string. The algorithm I made was:
Assert that the string is non-empty
Iterate through the string and add all non-repeated characters to a set
Assert that the set be non-empty
Iterate through string again and return the first character that's in the set
Add a useless return statement to make the compiler happy. (Arbitrarily return 'F')
Obviously my algorithm is very "brute force" and could be improved on. It runs, anyhow. I was wondering if there's a better way to do this and was also wondering what the convention is for useless return statements. Don't be afraid to criticize me harshly. I'm trying to become a C++ stiffler. ;)
#include <iostream>
#include <string>
#include <set>
char first_nonrepeating_char(const std::string&);
int main() {
std::string S = "yodawgIheardyoulike";
std::cout << first_nonrepeating_char(S);
}
// Finds that first non-repeated character in the string
char first_nonrepeating_char(const std::string& str) {
assert (str.size() > 0);
std::set<char> nonRepChars;
std::string::const_iterator it = str.begin();
while (it != str.end()) {
if (nonRepChars.count(*it) == 0) {
nonRepChars.insert(*it);
} else {
nonRepChars.erase(*it);
}
++it;
}
assert (nonRepChars.size() != 0);
it = str.begin();
while (it != str.end()) {
if (nonRepChars.count(*it) == 1) return (*it);
++it;
}
return ('F'); // NEVER HAPPENS
}
The main problem is just getting rid of warnings.
Ideally you should be able to just say
assert( false ); // Should never get here
but unfortunately that does not get rid of all warnings with the compilers I use most, namely Visual C++ and g++.
Instead I do this:
xassert_should_never_get_here();
where xassert_should_never_get_here is a function that
is declared as "noreturn" by compiler-specific means, e.g. __declspec for Visual C++,
has an assert(false) to handle debug builds,
then throws a std::logic_error.
The last two points are accomplished by a macro XASSERT (its actual name in my code is CPPX_XASSERT, it's always a good idea to use prefixes for macro names so as to reduce name conflict probability).
Of course, the assertion that you should not get to the end, is equivalent to an assertion that the argument string does contain at least one non-repeated character, which therefore is a precondition of the function (part of its contract), which I think should be documented by a comment. :-)
There are three main "modern C++" ways of coding things up when you do not have that precondition, namely
choose one char value to signify "no such", e.g. '\0', or
throw an exception in the case of no such, or
return a boxed result which can be logically "empty", e.g. the Boost class corresponding to Barton and Nackmann's Fallible.
About the algorithm: when you're not intested in where the first non-repeating char is, you can avoid the rescan of the string by maintaining a count per character, e.g. by using a map<char, int> instead of a set<char>.
There is a simpler and "cleaner" way of doing it, but it is not computationally faster than "brute force".
Use a table that counts the number of occurrences of each character in the input string.
Then go over the input string one more time, and return the first character whose count is 1.
char GetFirstNonRepeatedChar(const char* s)
{
int table[256] = {0};
for (int i=0; s[i]!=0; i++)
table[s[i]]++;
for (int i=0; s[i]!=0; i++)
if (table[s[i]] == 1)
return s[i];
return 0;
}
Note: the above will work for ASCII strings.
If you're using a different format, then you'll need to change the 256 (and the char of course).

Converting std::string to upper case: major performance difference?

So I was playing around with some code and wanted to see which method of converting a std::string to upper case was most efficient. I figured that the two would be somewhat similar performance-wise, but I was terribly wrong. Now I'd like to find out why.
The first method of converting the string works as follows: for each character in the string (save the length, iterate from 0 to length), if it's between 'a' and 'z', then shift it so that it's between 'A' and 'Z' instead.
The second method works as follows: for each character in the string (start from 0, keep going till we hit a null terminator), apply the build in toupper() function.
Here's the code:
#include <iostream>
#include <string>
inline std::string ToUpper_Reg(std::string str)
{
for (int pos = 0, sz = str.length(); pos < sz; ++pos)
{
if (str[pos] >= 'a' && str[pos] <= 'z') { str[pos] += ('A' - 'a'); }
}
return str;
}
inline std::string ToUpper_Alt(std::string str)
{
for (int pos = 0; str[pos] != '\0'; ++pos) { str[pos] = toupper(str[pos]); }
return str;
}
int main()
{
std::string test = " abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!##$%^&*()_+=-`'{}[]\\|\";:<>,./?";
for (size_t i = 0; i < 100000000; ++i) { ToUpper_Reg(test); /* ToUpper_Alt(test); */ }
return 0;
}
The first method ToUpper_Reg took about 169 seconds per 100 million iterations.
The second method Toupper_Alt took about 379 seconds per 100 million iterations.
What gives?
Edit: I changed the second method so that it iterates the string how the first one does (set the length aside, loop while less than length) and it's a bit faster, but still about twice as slow.
Edit 2: Thanks everybody for your submissions! The data I'll be using it on is guaranteed to be ascii, so I think I'll be sticking with the first method for the time being. I'll keep in mind that toupper is locale specific for when/if I need it.
std::toupper uses the current locale to do case conversions, which involves a function call and other abstractions. So naturally, it will be slower. But it will also work on non-ASCII text.
toupper() does more than just shift characters in the range [a-z]. For one thing it's locale dependent and can handle more than just ASCII.
toupper() takes the locale into account so it can handle (some) international characters and is much more complex than just handling the character range 'a'-'z'.
Well, ToUpper_Reg() doesn't work. For example, it doesn't turn my name into all uppercase characters. That said, ToUpper_Alt() also doesn't work because it toupper() gets passed a negative value on some platforms, i.e. it creates undefined behavior (normally a crash) when using it with my name. This is easily fixed, though, by correctly calling it something like this:
toupper(static_cast<unsigned char>(str[pos]))
That said, the two versions of the code are not equivalent: the version onot using toupper() isn't writing the characters all the time while the latter version is: once everything is converted to uppercase it always takes the same branch after a test and then does nothing. You might want to change ToUpper_Alt() to look like this and retest:
inline std::string ToUpper_Alt(std::string str)
{
for (int pos = 0; str[pos] != '\0'; ++pos) {
if (islower(static_cast<unsigned char>(str[pos])) {
str[pos] = toupper(static_cast<unsigned char>(str[pos]));
}
}
return str;
}
I would guess the difference is the writing: toupper() trades the comparison for an array look-up. The locale is quickly accessed and all toupper() does is get the current pointer and access the location at a given offset. With data in the cache this is probably as fast as the branch.
The second on involves a function call. a function call is an expensive operation in an inner loop. toupper also uses locales to determine how the character should be changed.
The advances of the call is that it is standard and will work regardless of character encoding on the host machine
That said, I would highly recommend use the boost function:
boost::algorithm::to_upper
It is a template so is more than likely to be inlined, however it does involve locales. I would still use it.
http://www.boost.org/doc/libs/1_40_0/doc/html/boost/algorithm/to_upper.html
I guess it's because the second one calls a C standard library function, that on the one hand isn't inlined, so you got the overhead of a function call. But even more important, this function probably does a lot more than just two comparisons, two jumps and two integer additions. It performs additional checks on the character and takes the current locale into account and all that stuff.
std::toupper uses the current locale and the reason why this is slower than the C function is that the current locale is shared and mutable from different threads, so it's necessary to lock the locale object when it's accessed to ensure it's not switched during the call. This happens once per call to toupper and introduces quite a large overhead (obtaining the lock might require a syscall depending on implementation). One workaround if you want to get the performance and respect the locale is to get the locale object first (creating a local copy) and then call the toupper facet on your copy, thus avoiding the need to lock for each toupper call. See the link below for an example.
http://www.cplusplus.com/reference/std/locale/ctype/toupper/
The question has already been answered, but as an aside, replacing the guts of your loop in the first method with:
std::string::value_type &c = str[pos];
if ('a' <= c && c <= 'z') { c += ('A' - 'a'); }
makes it even faster. Maybe my compiler just sucks.