C++ custom string trim implementation valgrind warning - c++

Recently I implemented a custom function for trimming std::strings that removes leading and trailing whitespace characters.
I tested the functionality and it works according to my unit tests, but when I run the tests using valgrind, I get the following output:
==4486== Conditional jump or move depends on uninitialised value(s)
==4486== at 0x415DDA: is_ws_char(char) (parse.cpp:22)
==4486== by 0x415BC6: parse::trim(std::string&) (parse.cpp:34)
My input test string was
string s(" a");
I do not see what is the problem here.
The code looks like this:
inline bool is_ws_char(const char c) { // this is line 22 in my code
return (c == ' ' || c == '\n' || c == '\t' || c == '\r');
}
void parse::trim(std::string& str) {
size_t len = str.size();
size_t i = 0;
for (; i < len; ++i)
if (!is_ws_char(str[i]))
break;
const size_t start = i;
for (i = len - 1; i >= 0; --i)
if (!is_ws_char(str[i])) // this is line 34 in my code
break;
const size_t end = i;
str = str.substr(start, end - start + 1);
}
Does anybody have an idea what the problem is here?
I briefly thought that it's just a valgrind oddity, but that seems rather unlikely.
Thanks in advance!

This loop is invalid
for (i = len - 1; i >= 0; --i)
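The condition is always true: i is an unsigned integer (size_t), so it can never be less than 0. When i is 0 and gets decremented, it wraps around to the largest size_t value, and str[i] then reads far outside the string, an out-of-range access of the kind valgrind reports.
Also, when str.size() is equal to zero, len - 1 wraps around in the same way and is equal to std::string::npos.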
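One way to sidestep the unsigned countdown altogether is to let std::string search for the first and last non-whitespace characters. The following is only a sketch: the free function name trim is an assumption (the original is a member of parse), and it treats the same four characters as is_ws_char does.

```cpp
#include <string>

// Hypothetical free-function variant of parse::trim (name is an assumption).
void trim(std::string& str) {
    const char* ws = " \n\t\r";  // same characters that is_ws_char accepts

    // Index of the first character that is not whitespace, or npos if none.
    const std::string::size_type start = str.find_first_not_of(ws);
    if (start == std::string::npos) {
        str.clear();  // empty or all-whitespace input: nothing to keep
        return;
    }

    // A non-whitespace character exists, so this search cannot fail.
    const std::string::size_type end = str.find_last_not_of(ws);
    str = str.substr(start, end - start + 1);
}
```

Because no index is ever decremented, the empty string and the all-whitespace string need no special arithmetic; they are handled by the npos check.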

Related

C++: How to set loop variables and conditions for the end of the loop

Excuse me, I'm having trouble learning C++. Our teacher said in class that in C++ the best way to indicate the end of the loop is !=, not <=, but I don't understand why. I just encountered the following problem, can someone help me?
#include <iostream>
#include <string>
using namespace std;
int main() {
string s = "He123llo"; int count = 0;
for (string::size_type index = 0; s[index] != '\0'; ++index) { //(*)
char ch = s[index];
if (ch >= '0' && ch <= '9') count++;
}
cout << count << endl;
}
int another() {
string s = "He123l4l5o";
string new1;
for (string::size_type index = 0; index != s.size(); ++index) { //(*)
char ch = s[index];
if (ch >= '0' && ch <= '9')
new1 += ch;
}
return 0;
}
1. What's wrong with changing the statement in (*) to for (int index=0;index!=s.size();++index)?
2. What is the problem if the statement in (*) is modified to for (string::size_type index=0;index<=s.size();++index)?
3. Can the statement in (*) be modified to for (string::size_type index=0;s[index]!='\0';++index)?
In the above code, the length of the string has not changed. Now: In another() function, please re-answer questions 2 and 3.
Like arrays, strings are 0-indexed, meaning the valid character indexes are 0 to size-1.
If you use s[index] != '\0' or index <= s.size(), the loop reads the character at index s.size(). Prior to C++11 that is undefined behavior for a non-const string; in C++11 and later it is allowed and yields a null character (size() does not count it, but the standard guarantees a terminator at that position for compatibility with C).
Using index != s.size() or index < s.size() is safe in all C++ versions.
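As a small illustration of the safe loop form, here is a sketch of the digit count from the question, written as a function. The name countDigits is made up for this example. Keeping the index as string::size_type also avoids comparing a signed int against the unsigned value returned by s.size().

```cpp
#include <string>

// Counts the decimal digits in s. The function name is illustrative only.
int countDigits(const std::string& s) {
    int count = 0;
    // index has the same unsigned type as s.size(), and the loop stops
    // before index reaches s.size(), so s[index] is always a real character.
    for (std::string::size_type index = 0; index != s.size(); ++index) {
        const char ch = s[index];
        if (ch >= '0' && ch <= '9') {
            ++count;
        }
    }
    return count;
}
```

Calling countDigits("He123llo") walks indexes 0 through 7 and never touches the terminator.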

Spinning words in C++ and receiving error

I'm trying to write a program for the prompt below, but I keep receiving the error Caught std::exception, what(): basic_string::at: __n (which is 0) >= this->size() (which is 0). I thought I was solid at C++, but I guess time takes its toll. My code is below. First I split the string on the space character and save the words in a vector<string>. Then I check whether a word is at least 5 characters long, reversing it if it is and leaving it alone if it is not. If it isn't the final word, I add a space at the end. Bing bang boom, prompt complete, or at least I thought.
std::string spinWords(const std::string &str)
{
std::vector<std::string> words;
std::string spinnedWord;
int count = 0;
for(unsigned int i = 0; i < str.length(); i++)
{
char currentChar = str.at(i);
if (currentChar == ' ')
{
count++;
continue;
}
else if((int)words.size() == count)
{
words.push_back(&currentChar);
}
else
{
words[count] += currentChar;
}
}
for(unsigned int i = 0; i < words.size(); i++)
{
if(words[i].size() >= 5)
{
for (unsigned int j = words[i].length() - 1; j >= 0; j--)
{
spinnedWord += words[j].at(i);
}
}
if(i + 1 != words.size())
{
spinnedWord += ' ';
}
}
return spinnedWord;
}// spinWords
Write a function that takes in a string of one or more words, and
returns the same string, but with all five or more letter words
reversed (Just like the name of this Kata). Strings passed in will
consist of only letters and spaces. Spaces will be included only when
more than one word is present.
Edit1: I have changed words[j].at(i); to words[i].at(j);
and I have changed words.push_back(&currentChar); to words.push_back(std::string(1, currentChar));
From what I currently understand, pushing back &currentChar caused undefined behavior. I'll look into how to avoid that in the future. However, the error from before is still present, so the question remains unanswered.
for (unsigned int j = words[i].length() - 1; j >= 0; j--)
{
spinnedWord += words[j].at(i);
}
You swapped j and i here. It must be words[i].at(j). Also, j probably shouldn't be unsigned, because the loop condition j >= 0 is always true for unsigned integers: decrementing j past 0 wraps around to a huge value instead of ending the loop.
EDIT: the UB concern for line words.push_back(&currentChar) is valid too. The way to fix it is to construct a string from a char explicitly:
words.push_back(std::string(1, currentChar));
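If the counter is to stay unsigned, one common idiom is to test and decrement in a single step, so the loop ends before the index can wrap around. The helper below is a sketch with a made-up name (reverseWord), not part of the original code.

```cpp
#include <string>

// Returns word reversed, walking backwards with an unsigned index.
// The function name is hypothetical.
std::string reverseWord(const std::string& word) {
    std::string out;
    out.reserve(word.size());
    // "j-- > 0" compares first and decrements afterwards, so the body sees
    // size()-1, ..., 1, 0 and the loop exits once the old value of j is 0.
    for (std::string::size_type j = word.size(); j-- > 0; ) {
        out += word[j];
    }
    return out;
}
```

An empty word is handled without a special case, because the condition is false on the first check.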
words.push_back(&currentChar);
You're trying to construct a std::string from a pointer to a single character. This compiles because there is a matching constructor, but it takes a C-style string, which your pointer to a single character isn't.
To be honest, I do not even understand your code: for example, what the variable count is doing in the program, or why you are using an additional container like std::vector when everything can and should be done with a single object of type std::string, which has all the facilities needed for the task.
The container std::vector is needed only if the assignment is to split a string into words and return them in an object of type std::vector<std::string>. But your task is entirely different.
Note that in general there can be more than one space between words. Even if that never happens in your input, you should use a general approach and not rely on there being exactly one space between words.
Your function does not work, for example, when the original string starts with a space character. In this case count will be equal to 1 due to this if statement
if (currentChar == ' ')
{
count++;
continue;
}
but the size of the vector will still be equal to 0. Since words.size() is not equal to count, the final else branch is executed for the next character
else if((int)words.size() == count)
{
words.push_back(&currentChar);
}
else
{
words[count] += currentChar;
}
and words[count] then indexes past the end of the empty vector, which results in undefined behavior.
I can suggest the following solution. In the demonstration program below I do not use the standard algorithm std::reverse, because I assume you are expected to reverse a word with your own code.
Here you are.
#include <iostream>
#include <string>
#include <utility>
std::string spinWords( const std::string &s, std::string::size_type length = 5 )
{
std::string t( s );
const char *delim = " \t";
for ( std::string::size_type i = 0; i != t.size(); )
{
auto pos = t.find_first_not_of( delim, i );
if ( pos != std::string::npos )
{
i = t.find_first_of( delim, pos );
if ( i == std::string::npos ) i = t.size();
if ( length < i - pos )
{
auto n = i - pos;
for ( std::string::size_type j = 0; j < n / 2; j++ )
{
std::swap( t[pos + j], t[i - j - 1] );
}
}
}
else
{
i = t.size();
}
}
return t;
}
int main()
{
std::string s( "1 12 123 1234 12345 123456 1234567 123456789 1234567890" );
std::cout << s << '\n';
std::cout << spinWords( s ) << '\n';
return 0;
}
The program output is
1 12 123 1234 12345 123456 1234567 123456789 1234567890
1 12 123 1234 12345 654321 7654321 987654321 0987654321
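Note that the program above leaves the five-character word 12345 unchanged, because its test is length < i - pos, whereas the prompt asks for words of five or more letters to be reversed. A shorter variant that uses >= and, unlike the answer, relies on std::reverse might look like the sketch below. The name spinWordsAtLeast and the parameter minLength are assumptions made for this example, and only the space character is treated as a separator.

```cpp
#include <algorithm>
#include <string>

// Reverses, in place within a copy of s, every space-separated word
// that has at least minLength characters. Names are illustrative only.
std::string spinWordsAtLeast(const std::string& s,
                             std::string::size_type minLength = 5) {
    std::string t(s);
    std::string::size_type pos = 0;
    // Skip any run of spaces, then locate the end of the word that follows.
    while ((pos = t.find_first_not_of(' ', pos)) != std::string::npos) {
        std::string::size_type end = t.find(' ', pos);
        if (end == std::string::npos) {
            end = t.size();  // last word runs to the end of the string
        }
        if (end - pos >= minLength) {
            std::reverse(t.begin() + pos, t.begin() + end);
        }
        pos = end;
    }
    return t;
}
```

Since the words are reversed in place, runs of several spaces between words are preserved as they are.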

What is the proper way of reading a composite key and a numeric value from an input file?

I'm trying to read a text file consisting of numerous strings which either represent a key/value (the key is a car number in a format of a letter/' '/3digits/' '/2letters; the value is unsigned long long; \t or ' ' between them) or an empty line, e.g.:
empty line
empty line
Z 999 ZZ 80
A 000 AA 124
Z 666 ZZ 42
I am using a cin.getline() function for that, reading a whole line and going through every character, saving a key and a value into an 'element' variable and pushing it into a vector afterwards. But for some reason the program seems to work unexpectedly, giving a weird output:
0
0
Z 999 ZZP 80
A 000 AA| 124
Z 666 ZZ* 42
So far I have been trying to analyse what could go wrong but I just can't see it. I've also tried using other tools like scanf() or cin.get() but failed miserably. Can someone please explain to me why this is happening and maybe show a more correct way of solving this task? Here is the code:
struct kv {
char key[8];
unsigned long long val;
};
int main()
{
std::vector<kv> data_vector;
kv element;
char str[64] = {};
char num[32] = {};
while (std::cin.getline(str, 64)) {
if (str[0] == ' ' || str[0] == '\n' || str[0] == '\t' || str[0] == EOF) {
continue;
}
size_t i = 0, n = 0;
for (i = 0; i < 8; i++)
element.key[i] = str[i];
while (!(str[i] >= '0' && str[i] <= '9'))
i++;
while (str[i] >= '0' && str[i] <= '9')
num[n++] = str[i++];
element.val = atoi(num);
data_vector.push_back(element);
for (n = 0; n < 32; n++)
num[n] = 0;
for (i = 0; i < 64; i++)
str[i] = 0;
}
for (size_t i = 0; i < data_vector.size(); i++) {
std::cout << data_vector[i].key << "\t" << data_vector[i].val << std::endl;
}
return 0;
}
EDIT: as @JimRhodes pointed out, changing char key[8] to char key[9] and adding element.key[8] = '\0' helped, but empty lines are still being processed the wrong way (as they should be ignored), giving an output of 0.
I think you may not be understanding how std::cin.getline() works. First of all, you do not want to test the return value of std::cin.getline() for true or false. You need to check for eof or fail. Secondly, std::cin.getline() discards the newline character so there is no need to check for '\n'. Your loop could start like this:
for ( ; ; )
{
str[0] = '\0'; // Clear any previous data
std::cin.getline(str, 64);
if ( std::cin.eof() )
{
break; // No more data, exit loop
}
if ( std::cin.fail() || (str[0] < 'A') )
{
continue; // Empty line or line does not start with a letter
}
. . .
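For comparison, here is a sketch of a different technique from the one in the answer above: reading each line into a std::string with std::getline and splitting it with stream extraction, so that no fixed-size buffers or manual terminators are needed. The names Record and readRecords are invented for this example, and the line format (letter, three digits, two letters, then the value, separated by spaces or tabs) is taken from the question.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type standing in for the question's kv struct.
struct Record {
    std::string key;         // e.g. "Z 999 ZZ"
    unsigned long long val;  // numeric value that follows the key
};

// Reads all well-formed lines from in; blank or malformed lines are skipped.
std::vector<Record> readRecords(std::istream& in) {
    std::vector<Record> records;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string letter, digits, suffix;
        unsigned long long val = 0;
        // operator>> skips spaces and tabs; extraction fails on an empty line.
        if (!(fields >> letter >> digits >> suffix >> val)) {
            continue;
        }
        records.push_back({letter + ' ' + digits + ' ' + suffix, val});
    }
    return records;
}
```

Calling readRecords(std::cin) would read from standard input. The sketch does not validate that the key parts have exactly the lengths described in the question; it only requires that three tokens and a number are present.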

How can I fix this c++ palindrome code? [closed]

So I'm trying to write a c++ palindrome program. I've come up with two functions so far.
void isPal(string str)
{
int a = 0, b = str.length();
string checker1 = "", checker2 = "";
for (; a != str.length(); a++)
checker1 += str[a];
for (; b >= 0; b--)
checker2 += str[b];
cout << checker1 << " " << checker2 << endl;
if (checker1 == checker2)
cout << "Palindrome baby!" << endl;
if (checker1 != checker2)
cout << "Not palindrome!" << endl;
}
bool isit(string str)
{
int x = str.length(), counter = 0;
if (str.length() <= 1)
return true;
else
{
while (counter != str.length())
{
string strNew = str.erase(0, 1);
strNew = strNew.erase(strNew.length() - 1);
string strNewer = str.replace(1, x, strNew);
return str[0] == str[str.length()] && isit(strNewer);
counter++;
}
}
}
Why does the first function always print "Not palindrome!"?
I'll admit that the second is a mess. I'm not even sure I completely understand my thinking when I wrote it. My intention was to come up with something similar to the recursive Python palindrome code.
In python the inductive case was simply
return str[0] == str[-1] and isit( str[1:-1] )
How can I write an inductive c++ palindrome code?
Update: -4 for a beginner's question!! really ? :)
The simplest way to check whether a string is a palindrome is to write
if ( s == std::string( s.rbegin(), s.rend() ) ) std::cout << "The string is a palindrome." << std::endl;
As for your recursive approach, the function could look like this
bool isPal( const std::string &s )
{
return ( s.size() < 2 || ( s[0] == s[s.size() - 1] && isPal( s.substr( 1, s.size() - 2 ) ) ) );
}
You have a lot of problems in both of your functions. To test for a palindrome, you only need to loop through the string once:
bool isPalindrome(const std::string& s)
{
for (int i = 0; i < s.length() / 2; ++i)
{
if (s[i] != s[s.length() - 1 - i])
{
return false;
}
}
return true; // if every element has a mirror, it is a palindrome
}
In your second version:
return str[0] == str[str.length()] && isit(strNewer);
counter++;
The second line will never get executed.
Writing a recursive version of the function would be a waste, but would require either copying the substrings, or providing an index to the function:
Copy Version
bool isPalindrome(const std::string& s)
{
if (s.length() <= 1)
return true;
if (s[0] != s[s.length() - 1]) // or s.front() != s.back() in C++11
return false;
std::string t = s.substr(1, s.length() - 2);
return isPalindrome(t);
}
Indexed Version
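bool isPalindrome(const std::string& s, std::string::size_type index = 0)
{
if (index < s.length() / 2) // strict comparison keeps s.length() - 1 - index in range, even for an empty string
{
return s[index] == s[s.length() - 1 - index] && isPalindrome(s, index + 1);
}
return true;
}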
There are two key things you should know about how strings work:
1. Every quoted string, aka "c-string" or "string literal", consists of an array of 0 or more char followed by a null character '\0', e.g. the string "bar" has three printable characters ('b', 'a', and 'r') and one non-printing character '\0' (the null character). So, str.c_str() would be equivalent to char str[4] = {'b','a','r','\0'} in this case.
2. The length() of a string is the number of characters it holds, not counting that terminating null character.
The first loop, for (; a != str.length(); a++), counts from a=0 up to a=(str.length()-1) (as it should). That means that only the characters of the string (and not the null terminator) are appended by your checker1 += str[a]; line. Internally, a '\0' is automatically kept at the end.
The second loop, for (; b >= 0; b--), does not properly mirror the first. It counts from b=str.length() down to b=0. Following from points 1 and 2, it should now be obvious that the character at str[str.length()] will always be '\0' (in C++11 and later). That means that checker2 first receives the null character and then the characters of the string in reverse. A std::string stores that '\0' as an ordinary character, so checker2 is always one character longer than checker1 and begins with '\0'. The two can therefore never compare equal, and your function will always print "Not palindrome!".
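A minimal sketch of the first function with the off-by-one removed is shown below. It returns a bool instead of printing, and the helper name reversed is an assumption made for this example. The backward loop starts at the last real character, str[str.length() - 1], rather than at the terminator.

```cpp
#include <string>

// Builds the reverse of str, starting from the last real character.
// The helper name is hypothetical.
std::string reversed(const std::string& str) {
    std::string out;
    // b runs from length() down to 1 and the character read is str[b - 1],
    // so the terminator at str[str.length()] is never appended.
    for (std::string::size_type b = str.length(); b > 0; --b) {
        out += str[b - 1];
    }
    return out;
}

// True when str reads the same forwards and backwards.
bool isPal(const std::string& str) {
    return str == reversed(str);
}
```

With this change both strings have the same length, so the comparison depends only on the characters themselves.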

pygtkscintilla auto indent

I'm trying to translate this C++ code and I can't work out what "char linebuf[1000]" is. Can some kind soul translate this to Python or explain what linebuf is? Thanks! :) Taken from http://www.scintilla.org/ScintillaUsage.html
if (ch == '\r' || ch == '\n') {
char linebuf[1000];
int curLine = GetCurrentLineNumber();
int lineLength = SendEditor(SCI_LINELENGTH, curLine);
//Platform::DebugPrintf("[CR] %d len = %d\n", curLine, lineLength);
if (curLine > 0 && lineLength <= 2) {
int prevLineLength = SendEditor(SCI_LINELENGTH, curLine - 1);
if (prevLineLength < sizeof(linebuf)) {
WORD buflen = sizeof(linebuf);
memcpy(linebuf, &buflen, sizeof(buflen));
SendEditor(EM_GETLINE, curLine - 1,
reinterpret_cast<LPARAM>(static_cast<char *>(linebuf)));
linebuf[prevLineLength] = '\0';
for (int pos = 0; linebuf[pos]; pos++) {
if (linebuf[pos] != ' ' && linebuf[pos] != '\t')
linebuf[pos] = '\0';
}
SendEditor(EM_REPLACESEL, 0, reinterpret_cast<LPARAM>(static_cast<char *>(linebuf)));
}
}
}
It is a buffer for a line of input text, of type char[1000], i.e. an array of 1000 char elements (which are actually single bytes; C++ inherits this from C, where char is the smallest addressable unit rather than a character in any particular encoding).
If we really wanted a literal translation of the algorithm, the closest fit in Python is probably something like array.array('B', [0]*1000). However, this initializes the Python array, whereas the C++ array is uninitialized. There is really no way to skip that initialization in Python; the C++ declaration just reserves space without paying any attention to what's already in that chunk of memory.
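It may also help to see what the buffer is used for. The loop in the quoted snippet overwrites every character that is not a space or tab with '\0', which in effect cuts the previous line down to its leading whitespace, and that whitespace is then inserted as the new line's indentation. Without the fixed-size buffer, that step could be sketched as follows; the function name leadingWhitespace is made up here and is not part of Scintilla.

```cpp
#include <string>

// Returns the run of spaces and tabs at the start of line.
// Hypothetical helper; not a Scintilla API.
std::string leadingWhitespace(const std::string& line) {
    // Position of the first character that is neither a space nor a tab.
    const std::string::size_type end = line.find_first_not_of(" \t");
    // If the whole line is whitespace, end is npos and substr returns it all.
    return line.substr(0, end);
}
```

Written this way, the same idea translates to Python more directly than the character array does, since it only needs a string and a search for the first non-blank character.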