This question may be very straightforward, but I am rather inexperienced with C++ and got stuck while writing a simple parser.
For some reason one of the string comparison functions would not return the expected value when called.
The function looks like this:
template<int length>
bool Parser::compare(const char *begin, const char *str){
int i = 0;
while(i != length && equalsCaseInsensitive(*begin, *str)){
i++;
begin++;
str++;
}
return i == length;
};
The purpose of this function was to compare a runtime character buffer with a compile-time constant string, e.g.:
compare<4>(currentByte, "<!--");
I know there are more efficient ways to compare a fixed length character buffer (and used one later on) but I was rather puzzled when I ran this function and it always returns false, even with two identical strings.
I checked with the debugger and checked the value of i at the end of the loop and it was equal to the value of the template parameter but still the return expression evaluated to false.
Are there any special rules about working with int template parameters?
I assumed the template parameter would behave like a compile time constant.
I don't know if this is relevant but I'm running gcc's g++ compiler and debugged with gdb.
If anyone could tell me what might cause this problem it would be highly appreciated.
The functions used in this piece of code:
template<typename Character>
Character toLowerCase(Character c){
return c > 64 && c < 91 ? c | 0x20 : c;
};
template<typename Character>
bool equalsCaseInsensitive(Character a, Character b){
return toLowerCase(a) == toLowerCase(b);
};
For case-insensitive string comparisons, I would try the standard library function std::strcoll from the header <cstring>, which has the signature
int strcoll( const char* lhs, const char* rhs );
and compares two null-terminated byte strings according to the current locale (in the default "C" locale it behaves like strcmp, so it is still case-sensitive there). Or, if you want to roll your own, you could use std::tolower from the header <cctype>, which has the signature
int tolower( int ch );
and converts the given character to lowercase according to the character conversion rules defined by the currently installed C locale.
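As a rough illustration of the roll-your-own route, here is a minimal sketch built on std::tolower. The name equalsIgnoreCase and its length parameter are made up for this example, and it assumes both buffers hold at least `length` readable characters.

```cpp
#include <cctype>
#include <cstddef>

// Hypothetical helper (not from the original post): compares the first
// `length` characters of two buffers, ignoring case.
bool equalsIgnoreCase(const char* a, const char* b, std::size_t length) {
    for (std::size_t i = 0; i < length; ++i) {
        // Convert through unsigned char first: std::tolower has undefined
        // behavior for negative values other than EOF.
        const int ca = std::tolower(static_cast<unsigned char>(a[i]));
        const int cb = std::tolower(static_cast<unsigned char>(b[i]));
        if (ca != cb) {
            return false;
        }
    }
    return true;
}
```

A call such as equalsIgnoreCase(currentByte, "<!--", 4) would then stand in for compare<4>(currentByte, "<!--").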
I have
#define ARG(TEXT , REPLACEMENT) replace(#TEXT, REPLACEMENT)
so
QString str= QString("%ONE, %TWO").ARG(%ONE, "1").ARG(%TWO, "2");
becomes
str= QString("%ONE, %TWO").replace("%ONE", "1").replace("%TWO", "2");
//str = "1, 2"
The problem is that VS2019, when formatting the code (Edit.FormatSelection) interprets that % sign as an operator and adds a whitespace
QString str= QString("%ONE, %TWO").ARG(% ONE, "1").ARG(% TWO, "2");
(I think it's a bug in VS). The code compiles without warnings.
As I am dealing with some ancient code in which this "feature" is widespread, I'm worried that auto-formatting text containing it will break functionality.
Is there a way at compile time to detect such arguments to a macro having space(s)?
Here's what I would do:
#define ARG(TEXT, REPLACEMENT) \
replace([]{ \
static constexpr char x[] = #TEXT; \
static_assert(x[0] == '%' && x[1] != ' '); \
return x; \
}(), REPLACEMENT)
Apparently some time in the next decade C++ will provide a better solution, and indeed there might be a much less clunky solution than the one I provide below, but it's maybe a place to start.
This version uses the Boost Preprocessor library to do a repetition which would have been straight-forward to write with a template if C++ allowed string literals as template arguments, a feature which has not yet gotten into the standard for motivations I can only guess at. So it doesn't actually test whether the argument has no spaces; rather it tests that there are no spaces in the first 64 characters (where 64 is an almost entirely arbitrary number which can be changed as your needs dictate). I used the Boost Preprocessor library; you could do this with your own special purpose macros if for some reason you don't want to use Boost.
#include <boost/preprocessor/repetition/repeat.hpp>
#define NO_SPACE_AT_N(z, N, s) && (N >= sizeof(s) || s[N] != ' ')
#define NO_SPACE(s) true BOOST_PP_REPEAT(64, NO_SPACE_AT_N, s)
// Arbitrary constant, change as needed---^
// Produce a compile time error if there's a space.
template<bool ok> struct NoSpace {
const char* operator()(const char* s) {
static_assert(ok, "Unexpected space");
return s;
}
};
#define ARG(TEXT, REPL) replace(NoSpace<NO_SPACE(#TEXT)>()(#TEXT), REPL)
(Test on gcc.godbolt.)
If the question is to produce a compilation error when the first argument of ARG contains a space, I managed to get this to work:
#include <cstddef>
template<std::size_t N>
constexpr int string_validate( const char (&s)[N] )
{
for (std::size_t i = 0; i < N; ++i)
if ( s[i] == ' ' )
return 0;
return 1;
}
template<int N> void assert_const() { static_assert(N, "string validation failed"); }
int replace(char const *, char const *) { return 0; } // dummy for example
#define ARG(TEXT , REPLACEMENT) replace((assert_const<string_validate(#TEXT)>(), #TEXT), REPLACEMENT)
int main()
{
auto b = ARG(%TWO, "2");
auto a = ARG(% ONE, "1"); // causes assertion failure
}
Undoubtedly there is a shorter way. Prior to C++20 you can't use a string literal in a template parameter, hence the constexpr function to produce an integer from the string literal and then we can check the integer at compile-time by using it as a template parameter.
It's unlikely.
Visual Studio works on source code, without running the preprocessor first and without performing what would be quite a difficult computation to work out whether the preprocessor would fundamentally alter the line it's formatting.
Besides, people don't really use macros in this way any more, or shouldn't (we have cheap functions!).
So this isn't really what the formatting feature expects.
If you can modify the code, make the user write .ARG("%ONE", "1"); then not only does the problem go away, but the code would also be more consistent.
Otherwise, you'll have to stick with formatting the code by hand.
Good day,
I am writing a simple C++ Linked List using templates. I have got everything working, but I wanted to add to the functionality by making it case insensitive by converting all characters to lowercase for when the template is of type string.
So, I wrote the following snippet to convert any word to lowercase:
#define TEMPLATE string // changing this changes the template type in the rest of the program
Stack <TEMPLATE> s; //not used in this example, but just to show that I have an actual template for a class declared at some point, not just a definition called TEMPLATE
TEMPLATE word; // User inputs a word that is the same type of the Linked List Stack to compare if it is in the Stack.
cin >> word; // just showing that user defines word
for (unsigned int i = 0; i < word.length(); i++)
{
if (word.at(i) >= 'A' && word.at(i) <= 'Z')
word.at(i) += 'a' - 'A';
}
The problem is that when the TEMPLATE of my Stack, and subsequently the compared word to the stack is not of type string, then it obviously throws error messages because the for loop was written specifically to look at strings.
So, is there a way I could make this function more generic so that any type can be passed? (I don't think so, since there would be no error checking for ints, etc. String is the only one that relies on this)
Or, is there a way that I can only execute the above code when my Template for my Stack and compared variable is of type string?
I looked at exception handling, except I'm very much used to how Python works and so I could not figure out exactly how to implement in C++ instead.
Just as a side note, I am not using any built in functions to convert the string to all lower cases, so that is not an option either and I am not looking for recommendations of those.
Create overloads to normalize your data:
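#include <cctype>
#include <string>
std::string normalize(const std::string& s) {
    std::string res(s);
    for (auto& c : res) {
        // Cast first: std::tolower is undefined for negative char values.
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return res;
}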
template <typename T>
const T& normalize(const T& t) { return t; }
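To show how the two overloads might plug into a templated container, here is a small sketch. The function sameItem is a hypothetical stand-in for whatever comparison the Stack performs and is not part of the original code. The std::tolower call could be swapped for the hand-written 'A'..'Z' loop from the question if library functions are off the table.

```cpp
#include <cctype>
#include <string>

// Chosen for std::string arguments: returns a lowercased copy.
std::string normalize(const std::string& s) {
    std::string res(s);
    for (auto& c : res) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return res;
}

// Chosen for every other type: returns the argument unchanged.
template <typename T>
const T& normalize(const T& t) { return t; }

// Hypothetical comparison helper: normalizes both sides before comparing,
// so strings compare case-insensitively and other types compare as usual.
template <typename T>
bool sameItem(const T& a, const T& b) {
    return normalize(a) == normalize(b);
}
```

When T is std::string, overload resolution prefers the non-template overload, so sameItem lowercases both sides. For int, double and so on, the template is the only match and nothing changes. Note that a bare string literal is not a std::string, so it would go through the pass-through template.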
I've just been introduced to toupper, and I'm a little confused by the syntax; it seems like it's repeating itself. What I've been using it for is for every character of a string, it converts the character into an uppercase character if possible.
for (int i = 0; i < string.length(); i++)
{
if (isalpha(string[i]))
{
if (islower(string[i]))
{
string[i] = toupper(string[i]);
}
}
}
Why do you have to list string[i] twice? Shouldn't this work?
toupper(string[i]); (I tried it, so I know it doesn't.)
toupper is a function that takes its argument by value. It could have been defined to take a reference to character and modify it in-place, but that would have made it more awkward to write code that just examines the upper-case variant of a character, as in this example:
// compare chars case-insensitively without modifying anything
if (std::toupper(*s1++) == std::toupper(*s2++))
...
In other words, toupper(c) doesn't change c for the same reasons that sin(x) doesn't change x.
To avoid repeating expressions like string[i] on the left and right side of the assignment, take a reference to a character and use it to read and write to the string:
for (size_t i = 0; i < string.length(); i++) {
char& c = string[i]; // reference to character inside string
c = std::toupper(c);
}
Using range-based for, the above can be written more briefly (and executed more efficiently) as:
for (auto& c: string)
c = std::toupper(c);
According to the documentation, the character is passed by value.
Because of that, the answer is no, it shouldn't work.
The prototype of toupper is:
int toupper( int ch );
As you can see, the character is passed by value, transformed and returned by value.
If you don't assign the returned value to a variable, it is lost.
That's why your example assigns the result back: to replace the original character.
As many of the other answers already say, the argument to std::toupper is passed and the result returned by-value which makes sense because otherwise, you wouldn't be able to call, say std::toupper('a'). You cannot modify the literal 'a' in-place. It is also likely that you have your input in a read-only buffer and want to store the uppercase-output in another buffer. So the by-value approach is much more flexible.
What is redundant, on the other hand, is your checking for isalpha and islower. If the character is not a lower-case alphabetic character, toupper will leave it alone anyway, so the logic reduces to this:
#include <cctype>
#include <iostream>
int
main()
{
char text[] = "Please send me 400 $ worth of dark chocolate by Wednesday!";
for (auto s = text; *s != '\0'; ++s)
*s = std::toupper(*s);
std::cout << text << '\n';
}
You could further eliminate the raw loop by using an algorithm, if you find this prettier.
#include <algorithm>
#include <cctype>
#include <iostream>
#include <iterator>
int
main()
{
char text[] = "Please send me 400 $ worth of dark chocolate by Wednesday!";
std::transform(std::cbegin(text), std::cend(text), std::begin(text),
[](auto c){ return std::toupper(c); });
std::cout << text << '\n';
}
toupper takes an int by value and returns, as an int, the uppercase version of that character. Whenever a function does not take a pointer or reference parameter, the argument is passed by value: the function works on a copy, so no change to it is visible outside the function. The way to capture the result is to store what the function returns, in this case the upper-cased character.
Note that there is a nasty gotcha in isalpha(), which is the following: the function only works correctly for inputs in the range 0-255 + EOF.
So what, you think.
Well, if your char type happens to be signed, and you pass a value greater than 127, this is considered a negative value, and thus the int passed to isalpha will also be negative (and thus outside the range of 0-255 + EOF).
In Visual Studio, this will crash your application. I have complained about this to Microsoft, on the grounds that a character classification function that is not safe for all inputs is basically pointless, but received an answer stating that this was entirely standards conforming and I should just write better code. Ok, fair enough, but nowhere else in the standard does anyone care about whether char is signed or unsigned. Only in the isxxx functions does it serve as a landmine that could easily make it through testing without anyone noticing.
The following code crashes Visual Studio 2015 (and, as far as I know, all earlier versions):
int x = toupper ('é');
So not only is the isalpha() in your code redundant, it is in fact actively harmful, as it will cause any strings that contain characters with values greater than 127 to crash your application.
See http://en.cppreference.com/w/cpp/string/byte/isalpha: "The behavior is undefined if the value of ch is not representable as unsigned char and is not equal to EOF."
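One common way to sidestep this is to convert to unsigned char before calling any of the <cctype> functions. A minimal sketch follows; the names toUpperSafe and toUpperCopy are invented for this example.

```cpp
#include <cctype>
#include <string>

// Hypothetical wrapper: safe for any char value, whether char is signed or not.
inline char toUpperSafe(char c) {
    // The cast maps negative char values into 0..255, the range that
    // std::toupper is defined for.
    return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}

// Returns an upper-cased copy of s using the wrapper above.
inline std::string toUpperCopy(std::string s) {
    for (char& c : s) {
        c = toUpperSafe(c);
    }
    return s;
}
```

In the default "C" locale only the ASCII letters are converted; bytes above 127 pass through unchanged instead of triggering undefined behavior.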
I am looking for the best way to remove the start of a string until the last occurrence of a character. For example. I have a char array that contains the following:
[Microsoft][ODBC SQL Server Driver][SQL Server]Option 'enable_debug_reports' requires a value of 0 or 1.
Basically, I am looking for the last occurrence of ']'. I would like my char array to be trimmed to:
Option 'enable_debug_reports' requires a value of 0 or 1.
I have found several ways to do this with the string data type. I am wondering if there is an effective way to manipulate a char array. My program requires several parameters to be char[] instead of strings. How would I use something like strcpy in my situation?
The below should work provided your string does contain the ']' character:
std::string trimIt(originalCStr);
std::string trimmed(trimIt.substr(trimIt.find_last_of("]") + 1)); // + 1 skips the ']' itself
strcpy(originalCStr, trimmed.c_str());
For a pure C approach:
char *toPtr = originalCStr;
char *fromPtr = strrchr(toPtr, ']'); // strrchr finds the last ']', strchr would find the first
++fromPtr;
while (*fromPtr != '\0') {
*toPtr = *fromPtr;
++fromPtr;
++toPtr;
}
*toPtr = '\0';
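You could use strrchr to find the last occurrence of ']' and then shift the rest of your char[] to the front. If you do this in place, use memmove rather than memcpy, because the source and destination regions overlap.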
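A minimal sketch of the strrchr idea, done in place. The helper name trimThroughLast is made up for this example, and it uses memmove instead of memcpy because the source and destination overlap within the same buffer.

```cpp
#include <cstring>

// Hypothetical helper: drops everything up to and including the last
// occurrence of `marker`, shifting the remainder to the front of `buf`.
// Returns false and leaves buf untouched if marker is not present.
bool trimThroughLast(char* buf, char marker) {
    if (marker == '\0') {
        return false;  // searching for the terminator makes no sense here
    }
    char* last = std::strrchr(buf, marker);
    if (last == nullptr) {
        return false;
    }
    const char* rest = last + 1;
    // + 1 copies the terminating '\0' as well; memmove tolerates overlap.
    std::memmove(buf, rest, std::strlen(rest) + 1);
    return true;
}
```

Called as trimThroughLast(originalCStr, ']'), this leaves only the text after the final bracket in the buffer.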
If you insist on not using std::string, for whatever reason, there is still a pure C++ approach using standard algorithms, which work just fine with raw arrays. The following is C++14 (it uses std::rbegin and std::rend), but you can adapt it to C++11 using std::reverse_iterator manually if necessary:
#include <algorithm>
#include <iostream>
#include <iterator>
template <class InputRange, class OutputIterator, class Value>
void CopyFrom(InputRange const& input, OutputIterator output_iter, Value const& value)
{
using std::rbegin;
using std::rend;
using std::end;
auto const iter_last = std::find(rbegin(input), rend(input), value);
std::copy(iter_last.base(), end(input), output_iter);
}
int main()
{
char const src[] = "[Microsoft][ODBC SQL Server Driver][SQL Server]Option 'enable_debug_reports' requires a value of 0 or 1.";
char * dst = new char[sizeof(src) + 1](); // just for this toy program
CopyFrom(src, dst, ']');
std::cout << dst;
delete[] dst;
}
Note that this solution assumes that you need the substring as a copy, and there is no error checking for input that does not contain the specified value.
And of course, you are probably better off switching to std::string and using c_str() for any C APIs.
I'm working with rapidxml, so I would like to have comparisons like this in the code:
if ( searchNode->first_attribute("name")->value() == "foo" )
This gives the following warning:
comparison with string literal results in unspecified behaviour [-Waddress]
Is it a good idea to substitute it with:
if ( !strcmp(searchNode->first_attribute("name")->value() , "foo") )
Which gives no warning?
The latter looks ugly to me, but is there anything else?
You cannot in general use == to compare C-style strings, since that only compares the addresses of their first characters, which is not what you want.
You must use strcmp(), but I would endorse this style:
if( strcmp(searchNode->first_attribute("name")->value(), "foo") == 0) { }
rather than using !, since that operator is a boolean operator and strcmp()'s return value is not boolean. I realize it works and is well-defined, I just consider it ugly and confused.
Of course you can wrap it:
#include <stdbool.h>
static bool first_attrib_name_is(const Node *node, const char *string)
{
return strcmp(node->first_attribute("name")->value(), string) == 0;
}
then your code becomes the slightly more palatable:
if( first_attrib_name_is(searchNode, "foo") ) { }
Note: I use the bool return type, which is standard from C99.
If the value() returns char* or const char*, you have little choice - strcmp or one of its length-limiting alternatives is what you need. If value() can be changed to return std::string, you could go back to using ==.
When comparing char* types with "==" you just compare the pointers. Use the C++ string type if you want to do the comparison with "==".
You have a few options:
You can use strcmp, but I would recommend wrapping it. e.g.
bool equals(const char* a, const char* b) {
return strcmp(a, b) == 0;
}
then you could write: if (equals(searchNode->first_attribute("name")->value(), "foo"))
You can convert the return value to a std::string and use the == operator
if (std::string(searchNode->first_attribute("name")->value()) == "foo")
That will introduce a string copy operation which, depending on context, may be undesirable.
You can use a string reference class. The purpose of a string reference class is to provide a string-like object which does not own the actual string contents. I've seen a few of these and it's simple enough to write your own, but since Boost has a string reference class, I'll use that for an example.
#include <boost/utility/string_ref.hpp>
using namespace boost;
if (string_ref(searchNode->first_attribute("name")->value()) == string_ref("foo"))
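If C++17 is available, std::string_view from the standard library plays the same non-owning role as Boost's string_ref. The following is only a sketch: equalsText is a hypothetical helper name, and the null check is an assumption about how you would want a missing value to be treated.

```cpp
#include <string_view>

// Hypothetical helper: compares a C string with the expected text without
// copying either one. A null pointer is treated as "no match", because
// std::string_view must not be constructed from nullptr.
bool equalsText(const char* value, std::string_view expected) {
    return value != nullptr && std::string_view(value) == expected;
}
```

The condition would then read if (equalsText(searchNode->first_attribute("name")->value(), "foo")). This guards only the value pointer; if the attribute lookup itself can come back null, that still needs its own check before calling value().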