Algorithm for template argument deduction (as strings)? - c++

For some reason, I want to implement something similar to what C++ compilers use to deduce template arguments, given a known set of template parameters like
"T0", "T1", "T2"...
Given 2 strings like:
str_param = "vector<T0>"
str_arg = "vector<float>"
The result should be that "T0" is mapped to "float":
map["T0"]=="float"
I don't need a full-featured template preprocessor; I'll be satisfied if I can handle the cases where the template argument can be deduced literally. There is no need to consider things like "typedef" in the context.
In other words, if I use the resulting map to replace the template parameters in str_param, it should become str_arg. If that is not possible, I consider it a "failure to match".
I currently have a problem handling cases like:
str_param = "T1*"
str_arg = "int**"
Where expected result is:
map["T1"]=="int*"
My algorithm mistakes it as:
map["T1"]=="int"
Putting my poor algorithm here:
std::vector<std::string> templ_params({"T1"});
std::vector<std::string> templ_args(templ_params.size());
std::string str_param = "T1*";
std::string str_arg = "int**";
const char* p_str_param = str_param.c_str();
const char* p_str_arg = str_arg.c_str();
while (*p_str_param != 0 && *p_str_arg != 0)
{
while (*p_str_param == ' ' || *p_str_param == '\t') p_str_param++;
while (*p_str_arg == ' ' || *p_str_arg == '\t') p_str_arg++;
if (*p_str_param == 0 || *p_str_arg == 0) break;
if (*p_str_param != *p_str_arg)
{
std::string templ_param;
std::string templ_arg;
while (*p_str_param == '_' ||
(*p_str_param >= 'a' && *p_str_param <= 'z') ||
(*p_str_param >= 'A' && *p_str_param <= 'Z') ||
(*p_str_param >= '0' && *p_str_param <= '9'))
templ_param += *(p_str_param++);
while (*p_str_param == ' ' || *p_str_param == '\t') p_str_param++;
char end_marker = *p_str_param;
const char* p_str_arg_end = p_str_arg;
while (*p_str_arg_end != end_marker) p_str_arg_end++;
while (*(p_str_arg_end - 1) == ' ' || *(p_str_arg_end - 1) == '\t')
p_str_arg_end--;
while (p_str_arg<p_str_arg_end) templ_arg += *(p_str_arg++);
for (size_t i=0; i < templ_params.size(); i++)
{
if (templ_params[i]==templ_param)
{
templ_args[i]=templ_arg;
break;
}
}
}
else
{
p_str_param++;
p_str_arg++;
}
}
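One possible way around the "T1*" / "int**" case is to stop scanning for the first end marker and instead try every candidate substring for a parameter, keeping the first one for which the rest of the pattern still matches (plain backtracking). The following is only a minimal sketch under some assumptions: the names deduce, matchFrom and Bindings are made up for illustration, both strings are assumed to be whitespace-normalized in the same way, and only angle brackets are tracked for nesting (parentheses in function types are not).

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <set>
#include <string>

using Bindings = std::map<std::string, std::string>;  // hypothetical alias

static bool isIdentChar(char c) {
    return c == '_' || std::isalnum(static_cast<unsigned char>(c)) != 0;
}

// Match param[pi..] against arg[ai..]; on success `out` holds the bindings.
static bool matchFrom(const std::string& param, std::size_t pi,
                      const std::string& arg, std::size_t ai,
                      const std::set<std::string>& names, Bindings& out) {
    if (pi == param.size()) return ai == arg.size();  // both must be used up

    if (!isIdentChar(param[pi]))  // punctuation has to match literally
        return ai < arg.size() && param[pi] == arg[ai] &&
               matchFrom(param, pi + 1, arg, ai + 1, names, out);

    std::size_t pe = pi;
    while (pe < param.size() && isIdentChar(param[pe])) ++pe;
    const std::string ident = param.substr(pi, pe - pi);

    const auto bound = out.find(ident);
    if (!names.count(ident) || bound != out.end()) {
        // Ordinary identifier, or a parameter that is already bound:
        // exactly this text has to appear in the argument.
        const std::string text = names.count(ident) ? bound->second : ident;
        return arg.compare(ai, text.size(), text) == 0 &&
               matchFrom(param, pe, arg, ai + text.size(), names, out);
    }

    // Unbound parameter: try candidates from shortest to longest, skipping
    // those with unbalanced '<' '>' and stopping at a top-level comma.
    int depth = 0;
    for (std::size_t len = 1; ai + len <= arg.size(); ++len) {
        const char c = arg[ai + len - 1];
        if (c == '<') ++depth;
        else if (c == '>') --depth;
        if (depth < 0 || (c == ',' && depth == 0)) break;
        if (depth != 0) continue;
        out[ident] = arg.substr(ai, len);
        if (matchFrom(param, pe, arg, ai + len, names, out)) return true;
        out.erase(ident);  // backtrack and try a longer candidate
    }
    return false;
}

// Returns false ("fail to match") if no consistent binding exists.
bool deduce(const std::string& param, const std::string& arg,
            const std::set<std::string>& names, Bindings& out) {
    out.clear();
    return matchFrom(param, 0, arg, 0, names, out);
}
```

With this sketch, deduce("T1*", "int**", {"T1"}, map) binds "T1" to "int*", because the shorter candidate "int" leaves a trailing "*" unmatched and is rejected. The worst case is exponential in the number of parameters, which is probably acceptable for short type strings.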

Related

iterate char by char through vector of strings

I want to iterate char by char in a vector of strings. In my code I created a nested loop to iterate over the string, but somehow I get an out of range vector.
void splitVowFromCons(std::vector<std::string>& userData, std::vector<std::string>& allCons, std::vector<std::string>& allVows){
for ( int q = 0; q < userData.size(); q++){
std::string userDataCheck = userData.at(q);
for ( int r = 0; r < userDataCheck.size(); r++){
if ((userDataCheck.at(r) == 'a') || (userDataCheck.at(r) == 'A') || (userDataCheck.at(r) == 'e') || (userDataCheck.at(r) == 'E') || (userDataCheck.at(r) == 'i') || (userDataCheck.at(r) == 'I') || (userDataCheck.at(r) == 'o') || (userDataCheck.at(r) == 'O') || (userDataCheck.at(r) == 'u') || (userDataCheck.at(r) == 'U')){
allVows.push_back(userData.at(r));
}
else if ((userDataCheck.at(r) >= 'A' && userDataCheck.at(r) <= 'Z') || (userDataCheck.at(r) >= 'a' && userDataCheck.at(r) <= 'z')){
allCons.push_back(userData.at(r));
}
else {
continue;;
}
}
}
}
The error here is in these lines:
allVows.push_back(userData.at(r));
allCons.push_back(userData.at(r));
The r variable is your index into the current string, but here you're using it to index into the vector, which looks like a typo to me. You can make this less error-prone using range-for loops:
for (const std::string& str : userData) {
for (char c : str) {
if (c == 'a' || c == 'A' || ...) {
allVows.push_back(std::string(1, c)); // the vectors hold strings, so wrap the char
}
else if (...) {
....
}
}
}
which I hope you'll agree also has the benefit of being more readable due to less noise. You can further simplify your checks with a few standard library functions:
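for (const std::string& str : userData) {
    for (char c : str) {
        // <cctype> functions expect a value representable as unsigned char
        const unsigned char uc = static_cast<unsigned char>(c);
        if (!std::isalpha(uc)) continue; // skip non-alphabetical
        const char cap = static_cast<char>(std::toupper(uc)); // capitalise the char
        if (cap == 'A' || cap == 'E' || cap == 'I' || cap == 'O' || cap == 'U') {
            allVows.push_back(std::string(1, c)); // the vectors hold strings, so wrap the char
        }
        else {
            allCons.push_back(std::string(1, c));
        }
    }
}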
Since this question is really about debugging, I think it is a nice illustration of how the standard algorithms of C++ can reduce the effort needed to see what is wrong with non-working code.
Here is how it can be restructured:
bool isVowel(char letter)
{
    return letter == 'A' || letter == 'a' ||
           letter == 'E' || letter == 'e' ||
           letter == 'I' || letter == 'i' ||
           letter == 'O' || letter == 'o' ||
           letter == 'U' || letter == 'u';
}
bool isConsonant(char letter)
{
    // std::isalpha expects a value representable as unsigned char
    return std::isalpha(static_cast<unsigned char>(letter)) && !isVowel(letter);
}
void categorizeLetters(const std::vector<std::string> &words, std::vector<char> &vowels, std::vector<char> &consonants)
{
for( const std::string &word : words){
std::copy_if(word.begin(), word.end(), std::back_inserter(vowels), isVowel);
std::copy_if(word.begin(), word.end(), std::back_inserter(consonants), isConsonant);
}
}
With a solution like this, you avoid the error-prone access-by-index that led to your problem. The code is also more readable and easier to follow.

String and character comparison

I am new to programming and I need to search any string to see if it includes only the letters a,b,c,d,e or f. The minute the program finds a letter that is not one of those the program should return false. Here is my function
bool is_favorite(string word){
int length = word.length(); // "word" is the string.
int index = 0;
while (index < length) {
if ((word[index] == 'a') || (word[index] == 'b') || (word[index] == 'c')||
(word[index] == 'd')|| (word[index] == 'e')|| (word[index] == 'f')) {
return true;
}
else {
return false;
}
index++;
}
}
Thank you very much for any help! :)
The moment a return statement is encountered, the function exits. Both branches of your if return, so the function always exits on the very first character: true if it is one of 'a', 'b', 'c', 'd', 'e', 'f', and false otherwise. The rest of the string is never examined.
You can use std::string::find_first_not_of as shown below:
std::string input = "somearbitrarystring";
std::string validChars = "abcdef";
std::size_t found = input.find_first_not_of(validChars);
if(found != std::string::npos)
std::cout << "Found nonfavorite character " <<input[found]<<" at position "<<found<< std::endl;
else
{
std::cout<<"Only favorite characters found"<<std::endl;
}
If you unroll the loop by hand, you will spot the problem immediately:
if ((word[0] == 'a') || (word[0] == 'b') || (word[0] == 'c')||
(word[0] == 'd')|| (word[0] == 'e')|| (word[0] == 'f')) {
return true;
}
else {
return false;
}
if ((word[1] == 'a') || (word[1] == 'b') || (word[1] == 'c')||
(word[1] == 'd')|| (word[1] == 'e')|| (word[1] == 'f')) {
return true;
}
else {
return false;
}
//...
That is, the return value depends only on the first element.
"The minute the program finds a letter that is not one of those the program should return false" means
if ((word[0] != 'a') && (word[0] != 'b') && (word[0] != 'c') &&
    (word[0] != 'd') && (word[0] != 'e') && (word[0] != 'f')) {
    return false;
}
if ((word[1] != 'a') && (word[1] != 'b') && (word[1] != 'c') &&
    (word[1] != 'd') && (word[1] != 'e') && (word[1] != 'f')) {
    return false;
}
// ...
// After checking all the characters, you know that all of them were in
// your desired set, so you can return unconditionally.
return true;
or, with a loop:
while (index < length) {
    // The tests must be joined with &&: a character is "not one of those"
    // only if it differs from every one of them. With || the condition
    // would be true for any character.
    if ((word[index] != 'a') && (word[index] != 'b') && (word[index] != 'c') &&
        (word[index] != 'd') && (word[index] != 'e') && (word[index] != 'f')) {
        return false;
    }
    index++;
}
return true;
bool is_favorite(string word){
return ( word.find_first_not_of( "abcdef" ) == std::string::npos );
}
It returns true if, and only if, there are only the characters 'a' through 'f' in the string. Any other character ends the search immediately.
And if you exchange string word with const string & word, your function will not have to create a copy of each word you pass to it, but work on a read-only reference to it, improving efficiency.
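Putting the two suggestions together (the find_first_not_of test and the const reference parameter) might look like the following. This is just a small sketch; it assumes the function is only ever asked about the lowercase letters 'a' through 'f', as in the question.

```cpp
#include <string>

// Takes the word by const reference, so no copy is made per call.
bool is_favorite(const std::string& word) {
    // npos means "no character outside the set was found".
    return word.find_first_not_of("abcdef") == std::string::npos;
}
```

Note that an empty string counts as a favorite here, since it contains no character outside the set.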
bool is_favorite(string word){
int length = word.length(); // "word" is the string.
int index = 0;
while (index < length) {
if (word[index] > 'f' || word[index] < 'a')
return false;
index++;
}
return true;
}
The return true is logically in the wrong place in your code.
Your version returns true as soon as it finds one letter that is a through f. It's premature to conclude that the whole string is valid at that point, because there may yet be an invalid character later in the string.
bool is_favorite(string word){
int length = word.length(); // "word" is the string.
int index = 0;
while (index < length) {
if ((word[index] == 'a') || (word[index] == 'b') || (word[index] == 'c')||
(word[index] == 'd')|| (word[index] == 'e')|| (word[index] == 'f')) {
return true; // This is premature.
}
else {
return false;
}
index++;
}
}
Here is a minimal change that illustrates where the return true should be: after the loop. The return true is reached if and only if we did not detect any invalid characters in the loop.
bool is_favorite(string word){
int length = word.length(); // "word" is the string.
int index = 0;
while (index < length) {
if ((word[index] == 'a') || (word[index] == 'b') || (word[index] == 'c')||
(word[index] == 'd')|| (word[index] == 'e')|| (word[index] == 'f')) {
// Do nothing here
}
else {
return false;
}
index++;
}
return true;
}
Obviously now that the affirmative block of the if is empty, you could refactor a little and only check for the negative condition. The logic of it should read closely to the way you described the problem in words:
"The minute the program finds a letter that is not one of those the program should return false."
bool is_favorite(string word){
int length = word.length(); // "word" is the string.
int index = 0;
while (index < length) {
if (!is_letter_a_through_f(word[index]))
return false;
index++;
}
return true;
}
I replaced your large logical check against many characters with a function in the above code to make it more readable. I trust you can write that function without difficulty. My own preference is to keep statements short so that they are readable, and so that when you read the code, you can hold the control-flow logic in your short-term memory without being overloaded by the mechanics of the letter comparison.
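For completeness, the helper could be written along these lines. The body of is_letter_a_through_f is my assumption (the answer above only names it), and it relies on 'a' through 'f' being contiguous in the character set, which holds for ASCII.

```cpp
#include <string>

// Hypothetical body for the helper named above.
bool is_letter_a_through_f(char c) {
    return c >= 'a' && c <= 'f';
}

bool is_favorite(const std::string& word) {
    for (char c : word) {
        if (!is_letter_a_through_f(c))
            return false;  // first offending character decides
    }
    return true;  // reached only if every character passed
}
```

The range-for loop also removes the index bookkeeping, so there is nothing left to forget to increment.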

Reduce cyclomatic complexity

I'm writing an NMEAParser library. As its name suggests, it parses NMEA sentences. Nothing crazy.
Its entry point is a function that accepts an NMEA string as its only parameter and looks at its beginning to pass it to the right decoder. Here is the function:
bool NMEAParser::dispatch(const char *str) {
if (!str[0]) {
return false;
}
//check NMEA string type
if (str[0] == '$') {
//PLSR245X
if (str[1] == 'P' && str[2] == 'L' && str[3] == 'S' && str[4] == 'R' && str[5] == ',' && str[6] == '2' && str[7] == '4' && str[8] == '5' && str[9] == ',') {
if (str[10] == '1')
return parsePLSR2451(str);
if (str[10] == '2')
return parsePLSR2452(str);
if (str[10] == '7')
return parsePLSR2457(str);
} else if (str[1] == 'G' && str[2] == 'P') {
//GPGGA
if (str[3] == 'G' && str[4] == 'G' && str[5] == 'A')
return parseGPGGA(str);
//GPGSA
else if (str[3] == 'G' && str[4] == 'S' && str[5] == 'A')
return parseGPGSA(str);
//GPGSV
else if (str[3] == 'G' && str[4] == 'S' && str[5] == 'V')
return parseGPGSV(str);
//GPRMC
else if (str[3] == 'R' && str[4] == 'M' && str[5] == 'C')
return parseGPRMC(str);
//GPVTG
else if (str[3] == 'V' && str[4] == 'T' && str[5] == 'G')
return parseGPVTG(str);
//GPTXT
else if (str[3] == 'T' && str[4] == 'X' && str[5] == 'T')
return parseGPTXT(str);
//GPGLL
else if (str[3] == 'G' && str[4] == 'L' && str[5] == 'L')
return parseGPGLL(str);
}
//HCHDG
else if (str[1] == 'H' && str[2] == 'C' && str[3] == 'H' && str[4] == 'D' && str[5] == 'G')
return parseHCHDG(str);
}
return false;
}
The problem I have is that this function's cyclomatic complexity is quite high, and my SonarQube complains about it.
It's not really a problem as the code is quite easy to read. But I was wondering how I could reduce its complexity while still keeping it simple to read and efficient.
You can simplify this quite a lot:
if (std::string_view{str, 10} == "$PLSR,245,")
{
switch (str[10])
{
case '1' : return parsePLSR2451(str);
case '2' : return parsePLSR2452(str);
case '7' : return parsePLSR2457(str);
}
}
else if (std::string_view{str + 1, 2} == "GP")
{
auto s = std::string_view{str + 3, 3};
if (s == "GGA")
return parseGPGGA(str);
if (s == "GSA")
return parseGPGSA(str);
// ... etc
}
else if (std::string_view{str + 1, 5} == "HCHDG")
{
return parseHCHDG(str);
}
return false;
There are no extra strings being constructed either, so it should be at least as efficient.
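One thing to watch with std::string_view{str, 10} is that it claims 10 characters regardless of where the terminating NUL is, so a shorter sentence may be read past its end; the original chain of && comparisons stopped at the first mismatch. A table-driven variant avoids this and pushes the branching into data. This is only a sketch under assumptions: it needs C++17, the Route struct and kRoutes table are invented for illustration, and the handlers are free-function stand-ins (with member functions you would store pointers to members instead).

```cpp
#include <array>
#include <string_view>

// Hypothetical stand-ins for the parser's real decoders.
bool parsePLSR2451(const char*) { return true; }
bool parseGPGGA(const char*) { return true; }
bool parseGPGSA(const char*) { return true; }
bool parseHCHDG(const char*) { return true; }

struct Route {
    std::string_view prefix;       // sentence start that selects the decoder
    bool (*handler)(const char*);  // decoder to call
};

constexpr std::array<Route, 4> kRoutes{{
    {"$PLSR,245,1", parsePLSR2451},
    {"$GPGGA", parseGPGGA},
    {"$GPGSA", parseGPGSA},
    {"$HCHDG", parseHCHDG},
    // ... one row per supported sentence
}};

bool dispatch(const char* str) {
    const std::string_view sentence{str};  // length is measured up to the NUL
    for (const Route& r : kRoutes) {
        // substr clips to the available length, so short input is safe.
        if (sentence.substr(0, r.prefix.size()) == r.prefix)
            return r.handler(str);
    }
    return false;
}
```

Adding a sentence type then means adding a row, and the function itself keeps a single loop and a single branch. The linear scan is slower than the hand-written tree in principle, but with a dozen short prefixes the difference is unlikely to matter.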

Concise way to say equal to set of values in C++

For example, I have the following check on a string:
if (str[i] == '(' ||
str[i] == ')' ||
str[i] == '+' ||
str[i] == '-' ||
str[i] == '/' ||
str[i] == '*')
My question is: is there a concise way in C++ to say "this value is one of this set of values"?
You can search for single character str[i] in a string with your special characters:
std::string("()+-/*").find(str[i]) != std::string::npos
Not glorious because it is C instead of C++, but the C standard library is always accessible from C++ code, and my first idea as an old dinosaur would be:
if (strchr("()+-/*", str[i]) != NULL)
Simple and compact
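One caveat with strchr: the terminating '\0' of the search string counts as part of it, so strchr("()+-/*", '\0') returns a non-null pointer. If str[i] can ever be the end of a C string, a small wrapper can guard against that. The name is_one_of_operators is made up for this sketch.

```cpp
#include <cstring>

// Hypothetical helper: true only for one of the six listed characters.
bool is_one_of_operators(char c) {
    // strchr also matches the terminating '\0' of the search string,
    // so reject it explicitly before searching.
    return c != '\0' && std::strchr("()+-/*", c) != nullptr;
}
```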
You may use the following:
const char s[] = "()+-/*";
// std::end(s) points past the terminating '\0', so stop one element earlier.
if (std::any_of(std::begin(s), std::end(s) - 1, [&](char c){ return c == str[i]; })) {
    // ...
}
It really depends on your application. For such a small check, and depending on the context, one acceptable option could be to use a macro
#include <iostream>
#include <string>
// Each use of the argument is parenthesized so that expressions are safe to pass.
#define IS_DELIMITER(c) (((c) == '(') || \
                         ((c) == ')') || \
                         ((c) == '+') || \
                         ((c) == '-') || \
                         ((c) == '/') || \
                         ((c) == '*'))
int main()
{
    std::string s("TEST(a*b)");
    for (std::size_t i = 0; i < s.size(); i++)
        std::cout << "s[" << i << "] = " << s[i] << " => "
                  << (IS_DELIMITER(s[i]) ? "Y" : "N") << std::endl;
    return 0;
}
A more C++ish way of doing it would be to use an inline function
inline bool isDelimiter(const char & c)
{
return ((c == '(') || (c == ')') || (c == '+') ||
(c == '-') || (c == '/') || (c == '*') );
}
This post might be interesting then : Inline functions vs Preprocessor macros
Maybe not "more concise", but I think this style is succinct and expressive at the point of the test.
Of course is_arithmetic_punctuation needn't be a lambda if you're going to use it more than once. It could be a function or a function object.
auto is_arithmetic_punctuation = [](char c)
{
switch(c)
{
case '(':
case ')':
case '+':
case '-':
case '/':
case '*':
return true;
default:
return false;
}
};
if (is_arithmetic_punctuation(str[i]))
{
// ...
}

Find the first printf format sequence in a C++ string

I am looking for the most concise and efficient way to find the first printf format sequence (conversion specification) in a C++ string (I cannot use std::regex, as it is not yet implemented in most compilers).
So the problem is to write an optimized function that will return the beginning of the first printf-format sequence pos and its length n from an input string str:
inline void detect(const std::string& str, int& pos, int& n);
For example, for:
%d -> pos = 0 and n = 2
the answer is: %05d -> pos = 15 and n = 4
the answer is: %% %4.2f haha -> pos = 18 and n = 5
How to do that (clever and tricky ways are welcome)?
Scan forward for %, then parse the content from there. There are some quirky ones, but not THAT bad (not sure you want to make it an inline tho').
General principle (I'm just typing as I go along, so probably not the BEST form of code ever written - and I haven't tried to compile it at all).
#include <cctype>
#include <string>

inline void detect(const std::string& str, int& pos, int& n)
{
    pos = -1; // Stays -1 if no conversion specification is found.
    n = 0;
    std::string::size_type last_pos = 0;
    for(;;)
    {
        last_pos = str.find('%', last_pos);
        if (last_pos == std::string::npos)
            break; // Not found anything.
        if (last_pos == str.length()-1)
            break; // Found stray '%' at the end of the string.
        char ch = str[last_pos+1];
        if (ch == '%') // double percent -> escaped %. Go on for next.
        {
            last_pos += 2;
            continue;
        }
        pos = static_cast<int>(last_pos);
        do
        {
            if (isdigit(static_cast<unsigned char>(ch)) || ch == '.' || ch == '-' || ch == '*' ||
                ch == '+' || ch == 'l' || ch == 'L' || ch == 'z' ||
                ch == 'h' || ch == 't' || ch == 'j' || ch == ' ' ||
                ch == '#' || ch == '\'')
            {
                last_pos++;
                ch = str[last_pos+1]; // '\0' once we reach the end of the string
            }
            else
            {
                // The below string may need appending to depending on version
                // of printf.
                if (std::string("AacdeEfFgGiopusxX").find(ch) == std::string::npos)
                {
                    // Do something about invalid string?
                }
                // ch sits at last_pos+1, so count the '%' and ch as well.
                n = static_cast<int>(last_pos + 2) - pos;
                return;
            }
        } while (last_pos < str.length());
    }
}
edit2: This bit:
if (isdigit(ch) || ch == '.' || ch == '-' || ch == '*' ||
ch == '+' || ch == 'l' || ch == 'L' || ch == 'z' ||
ch == 'h' || ch == 't' || ch == 'j' || ch == ' ' ||
ch == '#' || ch == '\'') ...
is probably better written as:
if (std::string("0123456789.-*+lLzhtj #'").find(ch) != std::string::npos) ...
Now, that's your homework done. Please report back with what grade you get.
Edit: It should be noted that some things that a regular printf will "reject" are accepted by the above code, e.g. "%.......5......6f", "%-5-6d" or "%-----09---5555555555555555llllld". If you want the code to reject these sorts of things, it's not a huge amount of extra work: it just needs a little bit of logic to check "have we seen this character before" in the "check for special characters or digit" step, since in most cases a special character should only be allowed once. And as the comment says, I may have missed a couple of valid format specifiers. It gets trickier still if you also need to cope with "this 'l' is not allowed with 'c'" or similar rules. But if the input isn't "malicious" (e.g. you want to annotate on which lines there are format specifiers in a working C source file), the above should work reasonably well.
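A stricter check along those lines can follow the order in which printf expects the pieces: flags, then width, then precision, then a length modifier, then the conversion character. The following is only a sketch of that idea and not a full validator: specLength and detectStrict are made-up names, the conversion list is the one from the code above, and it does not check which length modifiers are legal for which conversion.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Length of the conversion specification starting at str[i] == '%',
// or 0 if the text there does not follow the expected order.
std::size_t specLength(const std::string& str, std::size_t i) {
    const std::size_t start = i++;  // step over '%'
    auto at = [&](std::size_t k) { return k < str.size() ? str[k] : '\0'; };
    auto isIn = [&](const char* set) {
        return at(i) != '\0' && std::string(set).find(at(i)) != std::string::npos;
    };
    auto digits = [&] {
        while (std::isdigit(static_cast<unsigned char>(at(i)))) ++i;
    };

    while (isIn("-+ #0'")) ++i;            // flags
    if (at(i) == '*') ++i; else digits();  // width
    if (at(i) == '.') {                    // precision
        ++i;
        if (at(i) == '*') ++i; else digits();
    }
    if ((at(i) == 'h' && at(i + 1) == 'h') ||
        (at(i) == 'l' && at(i + 1) == 'l')) i += 2;  // hh, ll
    else if (isIn("hlLzjt")) ++i;                    // other length modifiers
    if (!isIn("AacdeEfFgGiopusxX")) return 0;        // conversion character
    return i + 1 - start;
}

// Finds the first well-formed specification; returns false if there is none.
bool detectStrict(const std::string& str, std::size_t& pos, std::size_t& n) {
    for (std::size_t i = str.find('%'); i != std::string::npos;
         i = str.find('%', i)) {
        if (i + 1 < str.size() && str[i + 1] == '%') { i += 2; continue; }
        n = specLength(str, i);
        if (n != 0) { pos = i; return true; }
        ++i;  // malformed: keep looking after this '%'
    }
    return false;
}
```

Unlike the version above, this one returns a bool instead of leaving pos and n untouched when nothing is found, and it skips malformed sequences such as "%-5-6d" rather than reporting them.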