Why does this string search and replace segfault? - c++

colours is a std::map<string, string> where the first element of each pair is a 2 letter std::string colour code, and the second element is the 7 character std::string shell escape code for that colour.
size_t i;
for(map<string, string>::iterator iter = colours.begin(); iter != colours.end(); iter++) {
while((i = text.find(iter->first)) != string::npos) {
text.replace(i, i + sizeof(iter->first), iter->second);
}
}
When this code is run, the program segfaults. My best guess is that it is something to do with the length of the replacement string being longer than the length of the string it's replacing, but to the best of my knowledge that can only cause segfaults with char *, not std::string.

string::replace takes a start position and a length, not a start position and an end position. Also sizeof does not return the length of a string. So the third line should probably read:
text.replace(i, iter->first.length(), iter->second);
For what it's worth, sizeof returns the amount of stack space consumed by the object, which is known at compile time. So it cannot vary from call to call, which the string's length does.

#luqui already answered why it segfaults, but I wanted to add something else.
Be careful not to end up in an infinite loop in the case where you your replacement string contains the search string. You can avoid this by passing string::find() a position to start searching. This will also improve efficiency as your current solution could degenerate to O(N2).

Related

How to find a character in a string?

I have already checked topics similar to this one but no one has been able to solve this problem.
So, I have to look for a character inside a string but it doesn't seem to work.
if (tracciatonuovos.find('T'))
{
nterminale++;
}
The counter does not increase. But if I try to find an empty space, it counts for me, and yet the string is full
First value is string, second is length of string, and third is the value of counter "nterminale".
use the find function from the std::string class
std::string mystr = "Some String with T";
size_t apos = mystr.find("T");
Read more about it here
If you want to find the first occurrence use :
find_first_of
And if you want to repeatedly find all occurrences of a specific character you will also need to specify a search start position and will need to write a loop say something like :
size_t pos = 0;
while((pos = mystr.find(whatever, pos)) != std::string::npos)
{
pos +=1;
// and your other logic here
}

Given an Array of strings how do I Remove Duplicates?

I would like to know how to remove duplicate strings from a container, but ignore word differences from trailing punctuation.
For example given these strings:
Why do do we we here here?
I would like to get this output:
Why do we here?
The algorithm:
While Reading a word is successful, do:
If End of file, quit.
If word list is empty, push back word.
else begin
Search word list for the word.
if word doesn't exist, push back the word.
end else (step 4)
end (while reading a word)
Use std::string for your word.
This allows you to do the following:
std::string word;
while (data_file >> word)
{
}
Use std::vector to contain your words (although you could use std::list as well). The std::vector grows dynamically so you don't have to worry about reallocation if you picked the wrong size.
To append to std::vector, use the push_back method.
To compare std::string, use operator==:
std::string new_word;
std::vector<std::string> word_list;
//...
if (word_list[index] == new_word)
{
continue;
}
So you have said you know how to tokenize a string. (If you don't spend some time here: https://stackoverflow.com/a/38595708/2642059) So I'm going to assume that we're given a vector<string> foo which contains words with possibly trailing punctuation.
for(auto it = cbegin(foo); it != cend(foo); ++it) {
if(none_of(next(it), cend(foo), [&](const auto& i) {
const auto finish = mismatch(cbegin(*it), cend(*it), cbegin(i), cend(i));
return (finish.first == cend(*it) || !isalnum(*finish.first)) && (finish.second == cend(i) || !isalnum(*finish.second));
})) {
cout << *it << ' ';
}
}
Live Example
It's worth noting here that you haven't given us rules on how to handle words like: "down", "down-vote", and "downvote" This algorithm presumes that the 1st 2 are equal. You also haven't given us rules for how to handle: "Why do, do we we here, here?" This algorithm always returns the final repetition, so the output would be "Why do we here?"
If the presumptions made by this algorithm are not totally to your liking leave me a comment and we'll work on getting you comfortable with this code to where you can make the adjustments that you need.

How can I reach the second word in a string?

I'm new here and this is my first question, so don't be too harsh :]
I'm trying to reverse a sentence, i.e. every word separately.
The problem is that I just can't reach the second word, or even reach the ending of a 1-word sentence. What is wrong?
char* reverse(char* string)
{
int i = 0;
char str[80];
while (*string)
str[i++] = *string++;
str[i] = '\0'; //null char in the end
char temp;
int wStart = 0, wEnd = 0, ending = 0; //wordStart, wordEnd, sentence ending
while (str[ending]) /*####This part just won't stop####*/
{
//skip spaces
while (str[wStart] == ' ')
wStart++; //wStart - word start
//for each word
wEnd = wStart;
while (str[wEnd] != ' ' && str[wEnd])
wEnd++; //wEnd - word ending
ending = wEnd; //for sentence ending
for (int k = 0; k < (wStart + wEnd) / 2; k++) //reverse
{
temp = str[wStart];
str[wStart++] = str[wEnd];
str[wEnd--] = temp;
}
}
return str;
}
Your code is somewhat unidiomatic for C++ in that it doesn't actually make use of a lot of common and convenient C++ facilities. In your case, you could benefit from
std::string which takes care of maintaining a buffer big enough to accomodate your string data.
std::istringstream which can easily split a string into spaces for you.
std::reverse which can reverse a sequence of items.
Here's an alternative version which uses these facilities:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <sstream>
#include <vector>
std::string reverse( const std::string &s )
{
// Split the string on spaces by iterating over the stream
// elements and inserting them into the 'words' vector'.
std::vector<std::string> words;
std::istringstream stream( s );
std::copy(
std::istream_iterator<std::string>( stream ),
std::istream_iterator<std::string>(),
std::back_inserter( words )
);
// Reverse the words in the vector.
std::reverse( words.begin(), words.end() );
// Join the words again (inserting one space between two words)
std::ostringstream result;
std::copy(
words.begin(),
words.end(),
std::ostream_iterator<std::string>( result, " " )
);
return result.str();
}
At the end of the first word, after it's traversed, str[wEnd] is a space and
you remember this index when you assign ending = wEnd.
Immediately, you reverse the characters in the word. At that point,
str[ending] is not a space because you included that space in the
letter-reversal of the word.
Depending on whether there are extra
spaces between words in the rest of the input, execution varies from this point, but it does eventually end with
you reversing a word that ended at the null terminator on the string
because you end the loop that increments wEnd on that null terminator and
include it in the final word reversal.
The very next iteration walks off of
the initialized part of the input string and the execution is undetermined from there because, heck, who knows what's in that array (str is stack-allocated, so it's whatever's sitting around in the memory occupied by the stack at that point).
On top of all of that, you don't update wStart except in the reversal loop,
and it never moves to wEnd all the way (see the loop exit condition), so come to think of it, you're never getting past that first word. Assuming that was fixed, you'd still have the problem I outlined at first.
All this assumes that you didn't just send this function something longer than 80 characters and break it that way.
Oh, and as mentioned in one of the comments on the question, you're returning stack-allocated local storage, which isn't going to go anywhere good either.
Hoo, boy. Where to start?
In C++, use std::string instead of char* if you can.
char[80] is an overflow risk if string is input by a user; it should be dynamically allocated. Preferably by using std::string; otherwise use new / new[]. If you meant to use C, then malloc.
cmbasnett also pointed out that you can't actually return str (and get the expected results) if you declare / allocate it the way you did. Traditionally, you'd pass in a char* destination and not allocate anything in the function at all.
Set ending to wEnd + 1; wEnd points to the last non-null character of the string in question (eventually, if it works right), so in order for str[ending] to break out of the loop, you have to increment once to get to the null char. Disregard that, I misread the while loop.
It looks like you need to use ((wEnd - wStart) + 1), not (wStart + wEnd). Although you should really use something like while(wEnd > wStart) instead of a for loop in this context.
You also should be setting wStart = ending; or something before you leave the loop, because otherwise it's going to get stuck on the first word.

Cannot get second while to loop properly

I'm making a function that removes elements from a string. However, I cant seem to get both of my loops to work together. The first while loop works flawlessly. I looked into it and I believe it might be because when "find_last_of" isn't found, it still returns a value (which is throwing off my loop). I haven't been able to figure out how I can fix it. Thank you.
#include <iostream>
#include <string>
using namespace std;
string foo(string word) {
string compare = "!##$";
string alphabet = "abcdefghijklmnopqrstuvxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
while(word.find_first_of(compare) < word.find_first_of(alphabet)) {
int position = word.find_first_of(compare);
word = word.substr(++position);
}
while(word.find_last_of(compare) > word.find_last_of(alphabet)){
int size = word.length();
word = word.substr(0, --size);
}
return word;
}
int main() {
cout << foo("!!hi!!");
return 0;
}
I wrote it like this so compound words would not be affected. Desired result: "hi"
It's not entirely clear what you're trying to do, but how about replacing the second loop with this:
string::size_type p = word.find_last_not_of(compare);
if(p != string::npos)
word = word.substr(0, ++p);
It's not clear if you just want to trim certain characters from the front and back of word or if you want to remove every one of a certain set of characters from word no matter where they are. Based on the first sentence of your question, I'll assume you want to do the latter: remove all characters in compare from word.
A better strategy would be to more directly examine each character to see if it needs to be removed, and if so, do so, all in one pass through word. Since compare is quite short, something like this is probably good enough:
// Rewrite word by removing all characters in compare (and then erasing the
// leftover space, if any, at the end). See std::remove_if() docs.
word.erase(std::remove_if(word.begin(),
word.end(),
// Returns true if a character is to be removed.
[&](const char ch) {
return compare.find(ch) != compare.npos;
}),
word.end());
BTW, I'm not sure why there is both a compare and alphabet string in your example. It seems you would only need to define one or the other, and not both. A character is either one to keep or one to remove.

Instead of having different size_t variables, can I use just one for searching a std::string multiple times?

I am wondering if it is possible to cut down how many size_t variables I use here. Here is what I have:
std::size_t found, found2, found3, found4 /* etc */;
Each has its own string to find:
found1 = msg.find("string1");
found2 = msg.find("string2");
found3 = msg.find("string3");
found4 = msg.find("string4");
// etc
If the word is found, then it will discard and prevent the message to be shown:
if (found1 != std::string::npos)
{
SendMsg("You cannot say that word!");
}
I have else if statements until found21. I'd like to cut everything down in size, so it would be clean, but I don't have a clue how to do it. I would also like it to lowercase the word. I have never used tolower at all either, so I would appreciated it if someone would help me.
To lowercase a string, you can do
std::transform(msg.begin(), msg.end(), msg.begin(), std::tolower);
Transform takes a begin and end iterator as the first and second arguments, and for each element in that range, applies the fourth argument (a function) and assigns it to what the third iterator is pointing to and increments it. By passing msg.begin() as both the first and third arguments, it will assign the result of the function to what it passed to it. So transform will basically do this:
for (auto src = begin(msg), dst = begin(msg); src != end(msg); ++src, ++dst)
*dst = tolower(*src);
but using transform is so much nicer.
To check whether a string contains any of a list of substrings, you can use a for loop with a vector:
vector<string> bad_strings { "bad word 1", "bad word 2", "etc" };
for (auto i = begin(bad_strings); i != end(bad_strings); ++i)
if (msg.find(*i)) {
SendMsg("You cannot say that word!");
break; // stop when you find it matches even one bad string
}