Find substring between two indices in C++

I want to find the substring between two indices. The substr(start_index, number_of_characters) function in C++ returns a substring based on a start index and a number of characters. Hence a small hack to use it with start and end indices is as follows:
// extract 'go' from 'iamgoodhere'
string s = "iamgoodhere";
int start = 3, end = 4;
cout<<s.substr(start,end-start+1); // go
What other methods exist in C++ to get the substring between two indices?

You can do this:
std::string(&s[start], &s[end+1])
or this:
std::string(s.c_str() + start, s.c_str() + end + 1)
or this:
std::string(s.begin() + start, s.begin() + end + 1)
These approaches require that end is less than s.size(), whereas substr() does not require that.
Don't complain about the +1--ranges in C++ are always specified as inclusive begin and exclusive end.
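As a rough sketch of how the inclusive-index idea from the question could be wrapped up, here is a small helper. The name substring_between and the decision to clamp out-of-range indices (instead of throwing, as substr does for a bad start position) are assumptions made for illustration; they are not part of the standard library.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the characters of s at positions
// [first, last], both inclusive. Out-of-range indices are clamped,
// so the result may be shorter than requested, or empty.
std::string substring_between(const std::string& s, std::size_t first, std::size_t last)
{
    if (first >= s.size() || last < first) {
        return std::string();
    }
    last = std::min(last, s.size() - 1);      // clamp to the last valid index
    return s.substr(first, last - first + 1); // +1 because last is inclusive
}
```

With the string from the question, substring_between("iamgoodhere", 3, 4) yields "go".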

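In addition to John Zwinck's answer, if you have iterators rather than indices you can use substr in combination with std::distance (substr takes a position and a length, not iterators):
auto pos = std::distance(myStr.begin(), itStart);
auto size = std::distance(itStart, itEnd);
std::string newStr = myStr.substr(pos, size);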

Here is one solution,
std::string distance_finder(std::string str, int start, int end)
{
return str.substr(start, end - start);
}
Note that end is treated as exclusive here, and it must not be less than start.

Related

Split the string in C++ using command std::string::substr

I'm able to get the first half of string:
insert1 = tCreatureOne.substr(0, (tCreatureOne.length) / 2
I don't know how to get the second half of the string
insert2 = tCreatureOne.substr((tCreatureOne.length) / 2), ?????)
Here is my code.
// Insert creature two in to the
//middle of creature one.Science!
// Hamster and Emu make a HamEmuster
std::string PerformScience(std::string tCreatureOne, std::string tCreatureTwo)
{
std::string insert1;
std::string insert2;
std::string insert3;
// first half : 0 to middle
insert1 = tCreatureOne.substr(0, (tCreatureOne.length) / 2);
// last half: from middle to the end
insert2 = tCreatureOne.substr((tCreatureOne.length) / 2), tCreatureOne.length);
insert3 = insert1 + tCreatureTwo + insert2;
return insert3;
Probably the most important developer skill is knowing how to do online research. A google search for "c++ substr" reveals this as the top result: http://www.cplusplus.com/reference/string/string/substr/
In the section describing parameters, len is described as follows:
Number of characters to include in the substring (if the string is shorter, as many characters as possible are used).
A value of string::npos indicates all characters until the end of the string.
So you could write:
insert2 = tCreatureOne.substr(tCreatureOne.length() / 2, std::string::npos);
However, note that substr is declared as follows:
string substr (size_t pos = 0, size_t len = npos) const;
Meaning len quite conveniently defaults to npos.
Therefore you could more simply write:
insert2 = tCreatureOne.substr(tCreatureOne.length() / 2);
However, even if substr didn't have such a convenient means of specifying 'the rest of the string', you could still have quite easily calculated it as follows:
int totalLength = tCreatureOne.length();
int firstLength = totalLength / 2;
int remainderLength = totalLength - firstLength;
//So...
insert2 = tCreatureOne.substr(tCreatureOne.length() / 2, remainderLength);
πάντα ῥεῖ is correct in their comment - to retrieve the second half of your string, you don't need to specify a second parameter (the end of the string):
insert2 = tCreatureOne.substr(tCreatureOne.length() / 2);
The line above will work perfectly fine. Also, since you're using std::string, remember to add the parentheses to the length() call.
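Putting the answers together, the asker's function could be written roughly as follows. This is only a sketch: it keeps the PerformScience name and parameter names from the question, but takes the arguments by const reference and drops the intermediate variables, which are choices made here rather than anything the answers prescribe.

```cpp
#include <cstddef>
#include <string>

// Insert tCreatureTwo into the middle of tCreatureOne.
std::string PerformScience(const std::string& tCreatureOne, const std::string& tCreatureTwo)
{
    const std::size_t middle = tCreatureOne.length() / 2;
    // substr(0, middle) is the first half; substr(middle) runs to the end
    // because the length argument defaults to npos.
    return tCreatureOne.substr(0, middle) + tCreatureTwo + tCreatureOne.substr(middle);
}
```

For the example in the question's comment, PerformScience("Hamster", "Emu") gives "HamEmuster".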

Bounds of std::string::find_first_of

Suppose I have a string foo and I want to search for the second period, if any.
I'm using this code:
std::size_t start = foo.find_first_of('.');
if (start != std::string::npos){
std::size_t next = foo.find_first_of('.', start + 1);
/*and so on*/
I'm wondering if this is well-defined if the first period is at the end of the string.
I think it is since start + 1 will be on the null-terminator, so I'm not in any danger of accessing any memory I shouldn't.
Am I correct?
If the first dot is at the end of the string, it's at index size() - 1.
So then start + 1 == size(), meaning that find_first_of will look in the interval [size(), size()). This is an empty interval, so no memory accesses will be made at all.
Before C++11 there might not be a null-terminator at that point (the standard only required c_str() to add one if necessary); since C++11, foo[foo.size()] is guaranteed to refer to a null character.
But your code is fine in any case. The behaviour on setting a pointer to point to 1-past-an-array is well-defined, so it's permissible to call the function with start + 1 if start is the last character in your string. Internally, a dereference of that pointer will not take place, because you're outside the region that find_first_of will search.
The C++ Standard does not impose any restriction on the value of the second parameter.
The function tries to calculate an actual position xpos the following way
pos <= xpos and xpos < size()
If it is unable to find such a value, it returns std::string::npos.
For example
std::string s( "AB" );
auto pos = s.find_first_of( "A", std::string::npos );
if ( pos == std::string::npos ) std::cout << "Not found" << std::endl;
The output is
Not found
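To make the boundary case concrete, the asker's two-step search can be wrapped in a small function. The name second_period is hypothetical; the body only uses find_first_of as described in the answers above, and relies on the fact that a start position equal to size() simply produces npos.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: position of the second '.' in foo,
// or std::string::npos if there are fewer than two periods.
std::size_t second_period(const std::string& foo)
{
    const std::size_t first = foo.find_first_of('.');
    if (first == std::string::npos) {
        return std::string::npos;
    }
    // If the first period is the last character, first + 1 == foo.size(),
    // the searched interval is empty, and the call returns npos.
    return foo.find_first_of('.', first + 1);
}
```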

How to use std::stod properly

I am working on writing a simple linear line calculator. For example, a user can enter two equations (strings) such as y=5x+3 and y=-3x+6. The most basic feature of this calculator is that it will return the intersection point of these two lines.
The obstacle I can't seem to figure out is how to parse the string into two pieces of data: the slope, and the y-intercept. This is a simple calculator, so the format of both lines will be y=mx+b, however, both the slope and/or y-intercept may be non-integer numbers (i.e. floats).
I came across a function in the string library called stod, which converts a number in a string to a numerical value (am I understanding this correctly?).
http://www.cplusplus.com/reference/string/stod/
My question is, will this function do the job? If so, how exactly do I use the "idx" parameter? I don't quite understand it.
If this isn't going to work, how can I parse this user-entered data?
both equations are strings (y=mx+b)
m and b have private variables dedicated in storing the decimal value (i.e. double m_ and double b_ are private member variables)
This is how the idx parameter works:
#include <string>
#include <iostream>
int main(void)
{
std::string data = "y=5.9568x+3.14"; //say you have a string like this..
double y, x, m, b;
y = 0;
x = 0;
std::size_t offset = 0; //offset will receive the number of characters stod consumed.
m = std::stod(&data[2], &offset); //So we want to get the value "5.9568"
b = std::stod(&data[offset + 3]); //When we reach this line, offset has a value of 6
std::cout<<b;
return 0;
}
So now you're asking why does it have a value of 6? Well because:
5.9568 is exactly: 6 characters in length. Thus on the next line when we do
b = std::stod(&data[offset + 3]);
we are actually feeding it a pointer to data[6 + 3]: the 6 characters of the number, plus 2 for the leading "y=", plus 1 to step over the x. That lands on the + sign right at the beginning of +3.14.
In other words it's equivalent to:
std::stod(&data[9]);
So the idx parameter receives the number of characters that stod consumed, counted from the start of the string it was given. If the string is:
str = "3.14159"
Then std::stod(str, &idx) will make idx equal to: 7.
If the string is:
str = "y = 1024.789" then std::stod(&str[4], &idx) will make idx equal to: 8, counted starting from &str[4].
Here's something simple with no error checking to get you started:
Assuming your input string is always exactly of the form y=mx+b and you wish to parse it to obtain the numerical values of m and b, you can first tokenize the string with y, =, x, and the space character as delimiters.
An example of a tokenizing function is reproduced here:
void tokenize(const std::string &str,
std::vector<std::string> &tokens,
const std::string &delimiters)
{
// Skip delimiters at beginning.
std::string::size_type lastPos = str.find_first_not_of(delimiters, 0);
// Find the first delimiter after that, i.e. the end of the first token.
std::string::size_type pos = str.find_first_of(delimiters, lastPos);
while (std::string::npos != pos || std::string::npos != lastPos)
{
// Found a token, add it to the vector.
tokens.push_back(str.substr(lastPos, pos - lastPos));
// Skip delimiters. Note the "not_of"
lastPos = str.find_first_not_of(delimiters, pos);
// Find the next delimiter, i.e. the end of the next token.
pos = str.find_first_of(delimiters, lastPos);
}
}
The first argument is the string to tokenize, the second is a reference to a vector<string> which the function will put the tokens in, and the third argument is a string containing all the delimiter characters. You can use it with the delimiters mentioned above like this:
string s = "y=-3x + 10";
vector<string> tokens;
tokenize(s, tokens, "y=x ");
For the example string above tokens will contain the following strings: -3, +, and 10.
Now you can iterate over tokens and call stod() on each token. You can put the results of stod() in a vector<double>:
vector<double> doubles;
for (vector<string>::iterator iter = tokens.begin(); iter != tokens.end(); ++iter) {
try {
doubles.push_back(stod(*iter)); // size_t* idx is an optional argument
} catch (...) {
// handle exceptions here. stod() will throw an exception
// on the "+" token but you can throw it away
}
}
Now doubles should have exactly 2 elements -- one for the slope and another for the intercept. Assuming the slope came first (the string was of the form y=mx+b instead of y=b+mx) then you can extract them from doubles:
double m = doubles[0];
double b = doubles[1];
Parsing the initial string is more complicated if the user is allowed different forms like y=b+mx (in that case the intercept came first), and much more complicated if the user can enter even stranger (but valid) forms like x*m+b=y (now you can't just assume that the number before the x character is the slope). It's not clear from your question exactly what alternate forms are considered valid, but nonetheless this should get you started.
Finally, as to your question about *idx, stod() puts into it the position of the first character after the number it parsed. This allows you to easily parse multiple numbers in a single string by skipping the number that was just parsed. Using the example at your reference link with some added comments:
std::string orbits ("365.24 29.53");
std::string::size_type sz; // alias of size_t
double earth = std::stod (orbits,&sz);
// sz now holds the position of the first character after 365.24, which is whitespace
// the next call to stod() will start from the sz position
double moon = std::stod (orbits.substr(sz));
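As a compact illustration of how idx can drive the parsing directly, here is a sketch that reads m and b from a string of exactly the form y=mx+b. The Line struct and the parse_line name are made up for this example, and the sketch assumes there are no spaces and that the slope is written explicitly (so "y=x+1" is rejected).

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical result type for this sketch.
struct Line {
    double m; // slope
    double b; // y-intercept
};

// Parses a string of the exact form "y=mx+b", e.g. "y=-3x+6".
// Throws std::invalid_argument if the shape does not match.
Line parse_line(const std::string& equation)
{
    const std::size_t eq = equation.find('=');
    if (eq == std::string::npos) {
        throw std::invalid_argument("missing '='");
    }
    const std::string rhs = equation.substr(eq + 1); // "mx+b"

    std::size_t used = 0;                   // characters consumed by stod
    const double m = std::stod(rhs, &used); // parses m, stops at 'x'
    if (used >= rhs.size() || rhs[used] != 'x') {
        throw std::invalid_argument("expected 'x' after the slope");
    }
    // Skip the 'x'; stod accepts the leading sign of "+b" or "-b".
    const double b = std::stod(rhs.substr(used + 1));
    return Line{m, b};
}
```

For "y=-3x+6" this gives m = -3 and b = 6; used is 2 after the first call, because "-3" is two characters long.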

Removing consecutive duplicate characters from a std::string

I'm currently trying to remove duplicate characters. For example:
maaaaaaa becomes ma
aaaaassssdddddd becomes asd
I have written the following piece of code:
string.erase(remove(string.find_first_of(string[i]) + 1, string.end(), string[i]), string.end());
but apparently std::string returns a pointer to the last + 1 character of the string, rather than the size, any ideas how I could remove string[i] from my string starting from the position next to that char?
string.find_first_of returns an integer position (and string::npos if not found). This is not compatible with std::remove, which expects iterators. You can convert from a position to an iterator by adding the position to the begin iterator.
char to_remove = string[i];
auto beg = string.begin() + string.find_first_of(to_remove) + 1;
auto new_end = std::remove(beg, string.end(), to_remove);
string.erase(new_end, string.end());
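If the goal is strictly to collapse runs of consecutive duplicates, as in the two examples in the question, a different technique from the one above is std::unique combined with erase. The function name collapse_runs is an assumption for this sketch. Note that it leaves non-adjacent repeats alone, so "abab" stays "abab".

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: collapses each run of identical adjacent
// characters into a single character.
std::string collapse_runs(std::string s)
{
    // std::unique moves the first character of every run to the front
    // and returns the new logical end; erase drops the leftover tail.
    s.erase(std::unique(s.begin(), s.end()), s.end());
    return s;
}
```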

c++ string member function substr usage

Please tell me if I am understanding the substr member function correctly.
result = result.substr(0, pos) + result.substr(pos + 1);
It takes the string from position 0 up to (but not including) pos, where remove[i] was found,
and then + result.substr(pos + 1); concatenates the rest of the string, skipping the character found in remove?
string removeLetters2(string text, string remove)
{
int pos;
string result = text;
for (int i = 0; i < remove.length(); i++)
{
while (true)
{
pos = result.find(remove[i]);
if (pos == string::npos)
{
break;
}
else
{
result = result.substr(0, pos) +
result.substr(pos + 1);
}
}
}
return result;
}
In short, you are asking if
result = result.substr(0, pos) +
result.substr(pos + 1);
removes the character at position pos, right?
Short Answer:
Yes.
Longer Answer:
The two-argument call takes the start index and the length (the one argument call goes to the end of string).
It helps to imagine the string like this:
F o o / B a r
0 1 2 3 4 5 6 <- indices
Now remove /:
F o o / B a r
0 1 2 3 4 5 6 <- indices
1 2 3 | <- 1st length
| 1 2 3 <- 2nd length
result = result.substr(0, 3) <- from index 0 with length 3
+ result.substr(4); <- from index 4 to end
As a programmer, always be aware of the difference between distance/index and length.
Better: If index is known:
Your code creates two new, temporary strings, which are then concatenated into a third temporary string, which is then copied to result.
It would be better to ask string to erase (wink wink) in place:
result.erase(pos,1);
// or by iterator
string::iterator it = ....;
result.erase(it,it+1);
This leaves more optimization freedom to the string implementer, who may choose to just move all characters after pos by one to the left. This could, in a specialized scenario, be implemented with a single assignment, a single loop, and within the loop with the x86 swap instruction.
Better: If characters to be deleted are known:
Alternatively, the remove_if algorithm may not perform better, but it can give clearer code. Note that remove_if only shifts the kept characters to the front and returns the new logical end, so it has to be combined with erase:
#include <algorithm>
// this would remove all slashes, question marks and dots
....
std::string foobar = "ab/d?...";
foobar.erase(std::remove_if(foobar.begin(), foobar.end(), [](char c) {
return c=='/' || c=='?' || c=='.';
}), foobar.end());
remove_if accepts any function object.
If there is just one character, it gets easier:
// this would remove all slashes
foobar.erase(std::remove(foobar.begin(), foobar.end(), '/'), foobar.end());
Although the answer to your question is "yes", there is a better way to go about what you are trying to do. Use string::erase, like this:
result.erase(pos, 1);
This API is designed for removal of characters from the string; it achieves the same result much more efficiently.
Yes, this function removes all letters in remove from text.
Since you seem to delete more than one type of character, have a look at remove_if from <algorithm> with a special predicate too, although dasblinkenlight's response is the good one.
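The remove_if suggestion could look roughly like this for the asker's function. This is a sketch, not a drop-in replacement: the name remove_letters is hypothetical, and the predicate simply asks whether each character occurs in the remove string.

```cpp
#include <algorithm>
#include <string>

// Hypothetical rewrite of removeLetters2: returns text with every
// character that occurs in remove deleted.
std::string remove_letters(std::string text, const std::string& remove)
{
    text.erase(std::remove_if(text.begin(), text.end(),
                              [&remove](char c) {
                                  // true if c is one of the characters to delete
                                  return remove.find(c) != std::string::npos;
                              }),
               text.end());
    return text;
}
```

This makes a single pass over text instead of rebuilding the string once per removed character.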