Removing consecutive duplicate characters from a std::string

I'm currently trying to remove duplicate characters. For example:
maaaaaaa becomes ma
aaaaassssdddddd becomes asd
I have written the following piece of code:
string.erase(remove(string.find_first_of(string[i]) + 1, string.end(), string[i]), string.end());
but apparently string.end() returns a pointer to one past the last character of the string rather than the size. Any ideas how I could remove string[i] from my string, starting from the position next to that char?

string.find_first_of returns an integer position (or string::npos if nothing is found). This is not compatible with std::remove, which expects iterators. You can convert a position to an iterator by adding the position to the begin iterator.
char to_remove = string[i];
auto beg = string.begin() + string.find_first_of(to_remove) + 1;
auto new_end = std::remove(beg, string.end(), to_remove);
string.erase(new_end, string.end());
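The snippet above removes every later occurrence of string[i], whether or not it is adjacent. If the goal is only to collapse runs of adjacent repeats, as in the examples in the question, std::unique may be a closer fit. A minimal sketch, assuming that reading of the question (the function name collapse_runs is made up for illustration):

```cpp
#include <algorithm>
#include <string>

// Collapses every run of equal adjacent characters into a single character.
// collapse_runs is a hypothetical name chosen for this sketch.
std::string collapse_runs(std::string s)
{
    // std::unique keeps the first character of each run, shifts the kept
    // characters to the front, and returns the new logical end.
    auto new_end = std::unique(s.begin(), s.end());
    // Erase the leftover tail so the size matches the new contents.
    s.erase(new_end, s.end());
    return s;
}
```

With this, collapse_runs("maaaaaaa") gives "ma" and collapse_runs("aaaaassssdddddd") gives "asd". Non-adjacent repeats, as in "abab", are left alone.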

Related

Find substring between two indices in C++

I want to find the substring between two indices. The substr(start_index, number_of_characters) function in C++ returns substring based on number of characters. Hence a small hack to use it with start and end indices is as follows:
// extract 'go' from 'iamgoodhere'
string s = "iamgoodhere";
int start = 3, end = 4;
cout<<s.substr(start,end-start+1); // go
What other methods exist in C++ to get the substring between two indices?

You can do this:
std::string(&s[start], &s[end+1])
or this:
std::string(s.c_str() + start, s.c_str() + end + 1)
or this:
std::string(s.begin() + start, s.begin() + end + 1)
These approaches require that end is less than s.size(), whereas substr() does not require that.
Don't complain about the +1--ranges in C++ are always specified as inclusive begin and exclusive end.
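Since the iterator and pointer forms do not check their bounds, one option is to wrap the check in a small helper. This is only a sketch: the name substring_between and the choice to throw std::out_of_range are assumptions, not something from the answer above.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns the characters of s from index first through index last, inclusive.
// substring_between is a hypothetical helper name.
std::string substring_between(const std::string& s, std::size_t first, std::size_t last)
{
    // Reject reversed ranges and any end index past the last character.
    if (first > last || last >= s.size())
        throw std::out_of_range("substring_between: invalid index range");
    // Iterator-pair constructor copies [begin + first, begin + last + 1).
    return std::string(s.begin() + first, s.begin() + last + 1);
}
```

For the string from the question, substring_between("iamgoodhere", 3, 4) yields "go".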

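In addition to John Zwinck's answer, you can use substr in combination with std::distance:
// itStart and itEnd are iterators into myStr; substr needs a position, not an iterator
auto size = std::distance(itStart, itEnd);
std::string newStr = myStr.substr(std::distance(myStr.begin(), itStart), size);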

Here is one solution,
std::string distance_finder(std::string str, int start, int end)
{
return str.substr(start, end - start);
}
Note that end is treated as exclusive here, and end must not be less than start.

Initial Check to see if a Substring is within range

I am in the beginnings of learning C++ and I am wondering if there is a way to assert that a substring can be created from a String, given a range. My String will vary in size each iteration. I am trying to create six substrings from that original String. With this variation in size, I am sometimes trying to access indexes of the String that do not exist for that particular iteration.
For example, if my String in iteration 1 is 11 characters
My first substring is from 3 characters - valid
My second substring is the next 3 characters - valid
My third substring is the next 5 characters - valid
My fourth substring is the next 4 characters - not valid - crashes program
My fifth substring - not valid, out of range
My sixth substring - not valid, out of range
I am wondering if there is a small check I can do to assert the length is valid. It's worth noting, I suppose, that I have not set any default values to these substrings. They are declared as:
string subS1
string subS2
string subS3
...
...
string subS6
Would setting all 6 substrings to null upon declaration alleviate this issue and for any valid substring, the value will just be overwritten?
Thanks in advance
subS1 = str.substr(0, 3); // Could be valid range
subS2 = str.substr(3, 3); // Could be valid range
subS3 = str.substr(6, 5); // Could be valid range
subS4 = str.substr(11, 4); // Could be valid range
subS5 = str.substr(15, 4); // Could be valid range
subS6 = str.substr(19); // from the nineteenth character to the end

Algorithm:
Step 1: Get the length of the string in the current iteration in a variable size.
Step 2: Run this check in the iteration before each call to substr, where start and len are the arguments you are about to pass:
std::size_t size = str.length();
if (start + len > size)
std::cout << "index exceeded";

Either check the size of str before extracting the string, or rely on std::string::substr's len parameter:
Number of characters to include in the substring (if the string is
shorter, as many characters as possible are used). A value of
string::npos indicates all characters until the end of the string.
Example:
#include <iostream>
#include <string>
int main ()
{
std::string str="Alexander the Great";
std::string str2 = str.substr (16, 25);
std::cout << str2 << '\n'; // eat
return 0;
}
It won't crash, it will just use as many characters as possible.
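This, however:
std::string str2 = str.substr (20, 25);
throws std::out_of_range, because the start position 20 is greater than the length of the string (19). If the exception is not caught, the program terminates, so do it like this in this case:
std::string str2 = ""; // initialise to an empty string, which I guess is the "null" you mean
if(str.size() > 20)
str2 = str.substr (20, 25);
// otherwise 'str2' stays an empty string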
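The size check can also be wrapped in a small helper so that each of the six extractions stays a one-liner. A minimal sketch, assuming an empty string is an acceptable result for an out-of-range start (safe_substr is a made-up name, not part of the standard library):

```cpp
#include <cstddef>
#include <string>

// Returns s.substr(pos, len) when pos lies within the string, otherwise "".
// safe_substr is a hypothetical helper name.
std::string safe_substr(const std::string& s, std::size_t pos,
                        std::size_t len = std::string::npos)
{
    // substr throws std::out_of_range only when pos > size(),
    // so that is the one case that needs guarding.
    if (pos > s.size())
        return std::string();
    // A len that runs past the end is clamped by substr itself.
    return s.substr(pos, len);
}
```

With str = "Alexander the Great", safe_substr(str, 16, 25) gives "eat" and safe_substr(str, 20, 25) gives an empty string instead of throwing.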

C++ substring why does it do that?

I am trying to reverse the order of a string in blocks of 5 characters, from the end of the string to the beginning. For example, if the input was "11111000002222233333", I want the output to be "33333222220000011111".
string reverse(string str)
{
string tmp = "";
for(int i = str.length(); i >= 5; i = i - 5)
{
tmp.append(str.substr(i - 5, i));
}
return tmp;
};
Let's just say that my input was "1000001010000000000010100", but it returns "101000000010100000000000010100010100000010000".

According to documentation:
Returns a substring [pos, pos+count). If the requested substring
extends past the end of the string, or if count == npos, the returned
substring is [pos, size()).
std::string substr(size_type pos = 0, size_type count = npos) const;
Your mistake is that you treat substr() second parameter as position index.
But it's substring length, not end position.
Quick fix is tmp.append(str.substr(i - 5, 5));
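Putting the fix into the whole function might look like the sketch below. The name reverse_blocks and the block parameter are assumptions added for illustration. As in the original loop, leftover characters at the front are dropped when the length is not a multiple of the block size.

```cpp
#include <cstddef>
#include <string>

// Rebuilds str from fixed-size blocks, taking the blocks from back to front.
// reverse_blocks and its block parameter are hypothetical names.
std::string reverse_blocks(const std::string& str, std::size_t block = 5)
{
    std::string result;
    if (block == 0)
        return result;  // avoid an endless loop on a zero block size
    result.reserve(str.size());
    // i is the exclusive end of the current block; it never drops below
    // block inside the loop, so i - block cannot underflow.
    for (std::size_t i = str.size(); i >= block; i -= block)
        result.append(str, i - block, block);  // second argument pair is (pos, count)
    return result;
}
```

reverse_blocks("11111000002222233333") returns "33333222220000011111".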

How to use std::stod properly

I am working on writing a simple linear line calculator. For example, a user can enter two equations (strings) such as y=5x+3 and y=-3x+6. The most basic feature of this calculator is that it will return the intersection point of these two lines.
The obstacle I can't seem to figure out is how to parse the string into two pieces of data: the slope, and the y-intercept. This is a simple calculator, so the format of both lines will be y=mx+b, however, both the slope and/or y-intercept may be non-integer numbers (i.e. floats).
I came across a function in the string library called stod, which converts a number in a string to a numerical value (am I understanding this correctly?).
http://www.cplusplus.com/reference/string/stod/
My question is, will this function do the job? If so, how exactly do I use the "idx" parameter? I don't quite understand it.
If this isn't going to work, how can I parse this user-entered data?
both equations are strings (y=mx+b)
m and b have private variables dedicated in storing the decimal value (i.e. double m_ and double b_ are private member variables)

This is how the idx parameter works:
#include <string>
#include <iostream>
int main(void)
{
std::string data = "y=5.9568x+3.14"; //say you have a string like this..
double y, x, m, b;
y = 0;
x = 0;
std::size_t offset = 0; //offset will be set to the number of characters that stod consumed.
m = std::stod(&data[2], &offset); //So we want to get the value "5.9568"
b = std::stod(&data[offset + 3]); //When we reach this line, offset has a value of 6
std::cout<<b;
return 0;
}
So now you're asking why it has a value of 6? Well, because:
5.9568 is exactly 6 characters in length. Thus on the next line when we do
b = std::stod(&data[offset + 3]);
we are actually feeding it a pointer to index 2 (where the number starts) + 6 (its length) + 1 (to skip the x), and that turns out to be right at the beginning of the +3.14.
In other words it's equivalent to:
std::stod(&data[9]);
So the idx parameter receives the number of characters that stod processed, i.e. the length of the double in characters within the string. If the string is:
str = "3.14159"
then std::stod(str, &idx) will make idx equal to 7.
If the string is:
str = "y = 1024.789"
then std::stod(&str[4], &idx) will make idx equal to 8, counting from &str[4].
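The same idea can be written with substr instead of raw pointers. This is a minimal sketch with no error handling, assuming the input is exactly of the form y=mx+b with no spaces. The Line struct and the parse_line name are invented for illustration. std::stod throws std::invalid_argument if no number can be read.

```cpp
#include <cstddef>
#include <string>

// Slope and y-intercept of a line; a hypothetical type for this sketch.
struct Line {
    double m;
    double b;
};

// Assumes eq looks exactly like "y=<m>x<sign><b>", e.g. "y=5.9568x+3.14".
Line parse_line(const std::string& eq)
{
    const std::size_t start = 2;  // skip the leading "y="
    std::size_t used = 0;         // receives the number of characters stod consumed
    double m = std::stod(eq.substr(start), &used);
    // start + used is the index of 'x'; one more step lands on the sign of b.
    double b = std::stod(eq.substr(start + used + 1));
    return Line{m, b};
}
```

For example, parse_line("y=-3x+6") yields a slope of -3 and an intercept of 6.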

Here's something simple with no error checking to get you started:
Assuming your input string is always exactly of the form y=mx+b and you wish to parse it to obtain the numerical values of m and b, you can first tokenize the string with y, =, x, and the space character as delimiters.
An example of a tokenizing function can be found here. Here it is reproduced:
void tokenize(const std::string &str,
std::vector<std::string> &tokens,
const std::string &delimiters)
{
// Skip delimiters at beginning.
std::string::size_type lastPos = str.find_first_not_of(delimiters, 0);
// Find the first delimiter after the token start.
std::string::size_type pos = str.find_first_of(delimiters, lastPos);
while (std::string::npos != pos || std::string::npos != lastPos)
{
// Found a token, add it to the vector.
tokens.push_back(str.substr(lastPos, pos - lastPos));
// Skip delimiters. Note the "not_of"
lastPos = str.find_first_not_of(delimiters, pos);
// Find the next delimiter, which ends the next token.
pos = str.find_first_of(delimiters, lastPos);
}
}
The first argument is the string to tokenize, the second is a reference to a vector<string> which the function will put the tokens in, and the third argument is a string containing all the delimiter characters. You can use it with the delimiters mentioned above like this:
string s = "y=-3x + 10";
vector<string> tokens;
tokenize(s, tokens, "y=x ");
For the example string above tokens will contain the following strings: -3, +, and 10.
Now you can iterate over tokens and call stod() on each token. You can put the results of stod() in a vector<double>:
vector<double> doubles;
for (vector<string>::iterator iter = tokens.begin(); iter != tokens.end(); ++iter) {
try {
doubles.push_back(stod(*iter)); // size_t* idx is an optional argument
} catch (...) {
// handle exceptions here. stod() will throw an exception
// on the "+" token but you can throw it away
}
}
Now doubles should have exactly 2 elements -- one for the slope and another for the intercept. Assuming the slope came first (the string was of the form y=mx+b instead of y=b+mx) then you can extract them from doubles:
double m = doubles[0];
double b = doubles[1];
Parsing the initial string is more complicated if the user is allowed different forms like y=b+mx (in that case the intercept came first), and much more complicated if the user can enter even stranger (but valid) forms like x*m+b=y (now you can't just assume that the number before the x character is the slope). It's not clear from your question exactly what alternate forms are considered valid, but nonetheless this should get you started.
Finally, as to your question about *idx, stod() puts into it the position of the first character after the number it parsed. This allows you to easily parse multiple numbers in a single string by skipping the number that was just parsed. Using the example at your reference link with some added comments:
std::string orbits ("365.24 29.53");
std::string::size_type sz; // alias of size_t
double earth = std::stod (orbits,&sz);
// sz now holds the position of the first character after 365.24, which is whitespace
// the next call to stod() will start from the sz position
double moon = std::stod (orbits.substr(sz));
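The same skip-ahead step can be repeated in a loop to read any number of whitespace-separated values. A minimal sketch, assuming the text contains only numbers separated by spaces or tabs (parse_numbers is a made-up name; std::stod throws std::invalid_argument on anything else):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Reads every whitespace-separated number in text.
// parse_numbers is a hypothetical helper name.
std::vector<double> parse_numbers(const std::string& text)
{
    std::vector<double> values;
    // pos is the index of the next unread, non-blank character.
    std::size_t pos = text.find_first_not_of(" \t");
    while (pos != std::string::npos) {
        std::size_t used = 0;  // characters consumed by this stod call
        values.push_back(std::stod(text.substr(pos), &used));
        // Skip the parsed number and any blanks that follow it.
        pos = text.find_first_not_of(" \t", pos + used);
    }
    return values;
}
```

parse_numbers("365.24 29.53") returns a vector holding the two orbit values from the example.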

c++ string member function substr usage

Please tell me if I am understanding the substr member function correctly.
result = result.substr(0, pos) + result.substr(pos + 1);
It takes the string from position 0 up to (but not including) pos, which is where remove[i] was found,
and then + result.substr(pos + 1); concatenates the rest of the string, leaving out the character from remove?
string removeLetters2(string text, string remove)
{
int pos;
string result = text;
for (int i = 0; i < remove.length(); i++)
{
while (true)
{
pos = result.find(remove[i]);
if (pos == string::npos)
{
break;
}
else
{
result = result.substr(0, pos) +
result.substr(pos + 1);
}
}
}
return result;
}

In short, you are asking if
result = result.substr(0, pos) +
result.substr(pos + 1);
removes the character at position pos, right?
Short Answer:
Yes.
Longer Answer:
The two-argument call takes the start index and the length (the one argument call goes to the end of string).
It helps to imagine the string like this:
F o o / B a r
0 1 2 3 4 5 6 <- indices
Now remove /:
F o o / B a r
0 1 2 3 4 5 6 <- indices
1 2 3 | <- 1st length
| 1 2 3 <- 2nd length
result = result.substr(0, 3) <- from index 0 with length 3
+ result.substr(4); <- from index 4 to end
As a programmer, always be aware of the difference between distance/index and length.
Better: If index is known:
Your code creates two new, temporary strings, which are then concatenated into a third temporary string, which is then copied to result.
It would be better to ask string to erase (wink wink) in place:
result.erase(pos,1);
// or by iterator
string::iterator it = ....;
result.erase(it,it+1);
This leaves more optimization freedom to the string implementer, who may choose to just move all characters after pos by one to the left. This could, in a specialized scenario, be implemented with a single assignment, a single loop, and within the loop with the x86 swap instruction.
Better: If characters to be deleted are known:
Alternatively, use the algorithm remove_if. I am not sure whether this gives better performance, but it may give cleaner code. Note that remove_if only shifts the kept characters to the front and returns the new logical end, so its result must be passed to erase:
#include <algorithm>
#include <string>
// this would remove all slashes, question marks and dots
....
std::string foobar = "ab/d?...";
foobar.erase(std::remove_if (foobar.begin(), foobar.end(), [](char c) {
return c=='/' || c=='?' || c=='.';
}), foobar.end());
remove_if accepts any function object.
If there is just one character, it gets easier:
// this would remove all slashes
foobar.erase(std::remove (foobar.begin(), foobar.end(), '/'), foobar.end());

Although the answer to your question is "yes", there is a better way to go about what you are trying to do. Use string::erase, like this:
result.erase(pos, 1);
This API is designed for removal of characters from the string; it achieves the same result much more efficiently.
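For the function in the question, which removes every character that appears in a second string, the loop of find and erase calls can be replaced by a single pass. A minimal sketch using std::remove_if followed by erase (remove_letters is a made-up name that mirrors removeLetters2 from the question):

```cpp
#include <algorithm>
#include <string>

// Returns text with every character that occurs in unwanted removed.
// remove_letters is a hypothetical name for this sketch.
std::string remove_letters(std::string text, const std::string& unwanted)
{
    // remove_if shifts the kept characters to the front and returns the
    // new logical end; it does not change the size of the string.
    auto new_end = std::remove_if(text.begin(), text.end(),
        [&unwanted](char c) { return unwanted.find(c) != std::string::npos; });
    // erase trims the leftover tail.
    text.erase(new_end, text.end());
    return text;
}
```

remove_letters("Foo/Bar", "/") returns "FooBar".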

Yes, this function removes all letters in remove from text.

Since you seem to delete more than one type of character, have a look at remove_if from <algorithm> with a suitable predicate too, although dasblinkenlight's answer is the good one.
