Does substr change the position where the find function starts searching?

Does substr change the position where the find function starts searching?
I have a char * named search_text containing the following text:
ABC_NAME = 'XYZSomeone' AND ABC_CLASS = 'XYZSomething'
I want to display the "ABC_NAME" value from that string.
Here is what I am doing:
std::cout << std::string(search_text).substr ( 12, std::string( search_text ).find ("'", 13 )-1) << std::endl;
My logic in the above in the substr is as follows:
The ABC_NAME value always begins at the 12th character, so start the substring there.
Do a find for the character ' (single quotation mark), starting from the 13th character (the second argument of the find() function). The resulting number will be the outer bound of the substr.
However, my code prints out the following:
XYZSomeone' AND ABC_C
However, when I try to display the value of the find() function directly, I do get the correct number for the location of the second ' (single quotation mark)
std::cout << std::string( search_text ).find ("'", 13 ) << std::endl;
This prints out:
22
So why is it that the substr is not finding the value of 22 as its second argument?

It's a rather simple matter to evaluate your expression by hand, seeing how you already verified the result of find:
std::string(search_text).substr ( 12, std::string( search_text ).find ("'", 13 )-1)
std::string("ABC_NAME = 'XYZSomeone' AND ABC_CLASS = 'XYZSomething'").substr ( 12, 22-1)
Now check the documentation for substr: "Returns a substring [pos, pos+count)". The character at position 12 is the 'X' for the name portion, and the character at position 12+21 = 33 is the 'L' from the class portion. So we expect the substring starting at that 'X' and going up to just before that 'L', which is "XYZSomeone' AND ABC_C". Check.
(It is understandable to forget whether substr takes a length or a position at which to end. Different languages do disagree on this. Hence the link to the documentation.)
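To make the length-versus-end-position distinction concrete, here is a minimal sketch. The helper name slice is hypothetical (it is not part of the standard library); it accepts an end position and converts it into the count that substr expects.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a standard function): returns the characters of
// `s` in the half-open range [begin, end). An `end` of npos, or one past the
// end of the string, means "up to the end of the string".
std::string slice(const std::string& s, std::size_t begin, std::size_t end)
{
    if (begin >= s.size())
        return {};
    if (end == std::string::npos || end > s.size())
        end = s.size();
    if (end < begin)
        return {};
    // substr takes a start position and a COUNT, so convert end -> count.
    return s.substr(begin, end - begin);
}
```

With the string from the question, slice(s, 12, s.find('\'', 12)) yields XYZSomeone, whereas s.substr(12, 21) yields the longer text the asker saw.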
Unsolicited commentary
Trying to do so much in one line makes your code harder to read and harder to debug. In this case, it also hurts performance. There is no need to convert search_text to a std::string twice.
std::string search_string{search_text};
std::size_t found = search_string.find('\'', 12);  // position of the closing quote
if ( found != std::string::npos )
    found -= 12;                                   // convert the position into a length
std::cout << search_string.substr(12, found) << std::endl;
This cuts the number of times a string is constructed (hence the times the string data is copied) from three to two.
If you are using C++17, you can improve the performance even more by constructing no strings. Just use std::string_view instead of std::string. For this scenario, it has the same member functions taking the same parameters; all you have to change is the type of search_string. This puts the performance on par with C code.
Even better: since string views are so cheap to create, you could even write your code – without a performance hit – so that it doesn't matter whether substr takes a length or takes the past-the-end position.
std::string_view search_string{search_text};
std::string_view ltrimmed = search_string.substr(12);
std::size_t found = ltrimmed.find('\'');
std::cout << ltrimmed.substr(0, found) << std::endl;
Constructive laziness FTW!

Related

Why is the length of a string off by one on the first read of a file?

I am perplexed with the way my program is performing. I am looping the following process:
1) take the name of a course from an input file
2) output the length of the name of the course
The problem is that the first value is always one less than the actual value of the string.
My first string contains 13 characters (including the colon), but nameOfClass.length() returns 12. The next string, the character count is 16 and indeed, nameOfClass.length() returns 16.
Every value after that also returns the expected value, it is only the first that returns the expected value minus 1.
Here's the (reduced) code:
std::ifstream inf("courseNames.txt");
int numberOfClasses = 10;
string nameOfClass;
for (int i = 0; i < numberOfClasses; i++) {
    std::getline(inf, nameOfClass,':');
    std::cout << nameOfClass.length() << "\n";
}
The file looks like this (courseNames.txt):
Pre-Calculus:
Public-Speaking:
English I:
Calculus I:
...etc. (6 more classes)
This is what I get:
12
16
10
11
Can anyone explain this behavior of the .length() function?

You have a problem, but you have drawn the wrong conclusion. std::getline extracts the delimiter but does not store it in the string, so the first result is indeed 12.
It doesn't store the delimiter for any subsequent line either, so why is there always one more character? Well, look at what comes after that :. That's right, a newline!
Pre-Calculus:
             ^ a newline
So your nameOfClass variable, except for the first string, always stores an extra newline before the other characters.
The fix is easy enough, just ignore the newline after reading the string.
inf.ignore(); // ignore one character
So the first result was not the wrong one; it was the only one that was right :)
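As a sketch of how the fix fits into the read loop: the function name name_lengths is hypothetical, and it takes any std::istream so that a std::istringstream can stand in for courseNames.txt. It assumes each ':' is followed by exactly one newline character.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads colon-terminated names (one per line) from `in`
// and returns the length of each name.
std::vector<std::size_t> name_lengths(std::istream& in)
{
    std::vector<std::size_t> lengths;
    std::string name;
    // getline extracts the ':' but does not store it in `name`.
    while (std::getline(in, name, ':')) {
        lengths.push_back(name.length());
        in.ignore();  // skip the single '\n' that follows the ':'
    }
    return lengths;
}
```

Fed the four lines shown in the question, this reports 12, 15, 9 and 10, the lengths of the names without the colon or the newline.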

Array conversion guidance

I'm stuck on an assignment which converts contents of an array (input from the user) to a pre-declared shorthand.
I want it to be as simple as strcpy(" and ", "+");
to change the word 'and' within a string, to a '+' sign.
Unfortunately, no matter how I structure the function, I get a deprecated conversion warning (I have tried various loops as well as direct calls).
Side note: this is for an assignment, so my string shortcuts are severely limited, and no pointers are allowed (I've seen several versions that clear the fault using them).
I'm not looking for someone to do my homework, just guidance on how strcpy can be applied without triggering the deprecation warning. Perhaps I shouldn't be using strcpy at all?

strcpy copies the contents of the second string into the memory of the first string. Since you're copying a string literal into a string literal it can't do it (you can't write to a string literal) and so it complains.
Instead you need to build your own search and replace system. You can use strstr() to search for a substring within a string, and it returns the pointer in memory to the start of that found string (if it's found).
Let's take the sample string Jack and Jill went up the hill.
char *andloc = strstr(buffer, " and ");
That would return the address of the start of the string (say 0x100) plus the offset of the word " and " (including spaces) within it (0x100 + 4) which would be 0x104.
Then, if found, you can replace it with the & symbol. However you can't use strcpy for that as it'll terminate the string. Instead you can set the bytes manually, or use memcpy:
if (andloc != NULL) { // it's been found
    andloc[1] = '&';
    andloc[2] = ' ';
}
or:
if (andloc != NULL) { // it's been found
    memcpy(andloc, " & ", 3);
}
That would result in Jack & d Jill went up the hill. We're not quite there yet. Next you have to shuffle everything down to cover the "d " from the old " and ". For that you'd think you could now use strcpy or memcpy, however that's not possible - the strings (source and destination) overlap, and the manual pages for both specifically state that the strings must not overlap and to use memmove instead.
So you can move the contents of the string after the "d " to after the "& " instead:
memmove(andloc + 3, andloc + 5, strlen(andloc + 5) + 1);
Adding a number to a string like that adds to the address of the pointer. So we're copying the data that starts 5 characters after the old " and " location into the space that starts 3 characters after it. The amount to copy is the length of the string from that fifth character onwards, plus one so that the terminating null character ('\0') at the end of the string is copied too.
Another manual way of doing it would be to iterate through each character until you find the end of the string:
char *to = andloc + 3;
char *from = andloc + 5;
while (*from) {     // Until the end of the string
    *to = *from;    // Copy one character
    to++;           // Move to the ...
    from++;         // ... next character pair
}
*to = 0;            // Add the end of string marker.
So now either way the string memory contains:
Jack & Jill went up the hill\0l\0
The \0 is the end of string marker, so the actual string "content" is only up as far as the first \0 and the l\0 is now ignored.
Note that this only works if you are replacing a part with something that is smaller. If the replacement is bigger, so that the string grows, the forward copy loop above no longer works and you have to use memmove for the overlapping move (it behaves as if the data were first copied to a temporary buffer). You must also make sure that your buffer has enough room to hold the finished string; this kind of thing is a common source of buffer overruns, which are a security headache and one of the biggest causes of systems being hacked. Finally, you have to do the whole thing in the opposite order: first move the latter part of the string to make room, then fill in the gap between the two halves.
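The steps above can be combined into one routine that handles both the shrinking and the growing case. This is only a sketch under stated assumptions: replace_first and its capacity parameter are hypothetical names, the caller is assumed to know the real size of the array behind the pointer, and the example replaces " and " with " + " as the question asked rather than with "&".

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: replaces the first occurrence of `from` in `buf`
// with `to`. `capacity` is the total size of the array behind `buf`,
// including room for the terminating '\0'. Returns false (leaving `buf`
// untouched) if `from` is not found or the result would not fit.
bool replace_first(char* buf, std::size_t capacity,
                   const char* from, const char* to)
{
    char* hit = std::strstr(buf, from);
    if (hit == nullptr)
        return false;

    const std::size_t from_len = std::strlen(from);
    const std::size_t to_len   = std::strlen(to);
    const std::size_t new_len  = std::strlen(buf) - from_len + to_len;
    if (new_len + 1 > capacity)
        return false;  // not enough room: refuse instead of overrunning

    // Move the tail (including its '\0') to where it must end up.
    // Source and destination may overlap, so memmove is required.
    char* tail = hit + from_len;
    std::memmove(hit + to_len, tail, std::strlen(tail) + 1);

    // Fill the gap with the replacement; no terminator is written here.
    std::memcpy(hit, to, to_len);
    return true;
}
```

Moving the tail before writing the replacement is what makes the growing case safe: the gap is opened first and only then filled.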

using size_t difference to copy a portion of a string

I am trying to iterate through a string and copy chunks of information based on an initial key value and a key value that identifies the end of the chunk. However, when I subtract my initial and final positions to find the length of the chunk I'm looking for, I get a seemingly arbitrary value.
The start and end indices are found by:
currentstringlocation = mystring.find("value_im_looking_to_start_at", 0);
endlocation = mystring.find("value_im_looking_to_stop_at", currentstringlocation);
I'm then trying to do something like:
mystring.copy(newstring,(endlocation-currentstringlocation), currentstringlocation);
This however isn't giving me the results I want. Here's an excerpt from my code and the output it yields.
stringlocation2=topoinfo.find("\n",stringlocation+11);
topoinfo.copy(address,(stringlocation2-stringlocation+11),stringlocation+11);
cout << (stringlocation2-stringlocation+11) << "\n";
cout << stringlocation2 << "\t" << stringlocation+11 << "\n";
output:
25
59 56
So clearly the chunk of info I'm trying to capture spans 3 characters, however when I subtract the two I get 25. Can someone explain to me why this happens and how I can work around it?

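You are calculating the length wrong. Subtraction and addition are evaluated left to right, so stringlocation2-stringlocation+11 means (stringlocation2-stringlocation)+11, which is 25 here rather than 3. Try something like this instead:
topoinfo.copy(address, stringlocation2 - (stringlocation + 11),
              stringlocation + 11);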
After this, address will contain the copied string. Remember though: If address is a character array or a character pointer, then you should add the terminating '\0' character yourself!
A better solution to get a substring is to actually use the std::string::substr function:
std::string address = topoinfo.substr(stringlocation + 11,
                                      stringlocation2 - (stringlocation + 11));

Should be
topoinfo.copy(address,stringlocation2-(stringlocation+11),stringlocation+11);
cout << stringlocation2-(stringlocation+11) << "\n";
You got your brackets wrong.
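One way to avoid this kind of bracket mistake is to name the begin and end positions before subtracting them. The following is a minimal sketch; value_after and its parameters are hypothetical names, and the key's own length stands in for the hard-coded 11 from the question.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the text that follows `key`, up to (but not
// including) the next '\n'. Returns an empty string if `key` is not found.
std::string value_after(const std::string& text, const std::string& key)
{
    const std::size_t key_pos = text.find(key);
    if (key_pos == std::string::npos)
        return {};

    const std::size_t begin = key_pos + key.size();   // first wanted character
    const std::size_t end   = text.find('\n', begin); // one past the last

    if (end == std::string::npos)
        return text.substr(begin);         // no newline: take the rest
    return text.substr(begin, end - begin); // length = end - begin
}
```

Because begin already includes the offset, the length is simply end - begin and no extra parentheses are needed.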

Horspool algorithm for multiple occurrences of the same pattern

I've implemented the Horspool algorithm in C++ (based on Introduction to the Design and Analysis of Algorithms by Anany Levitin, 2nd edition, p. 258) to find the position of the first occurrence of a desired pattern in a text. However, I want to extend the algorithm to find multiple occurrences of the same pattern. Unfortunately, I got stuck on the latter implementation. You can see my code below:
The function calculates and returns the position of the first occurrence of a desired pattern in the text. The shift sizes are stored in the ShiftTable, which is indexed by the characters of a desired alphabet. Additionally, the integer counter is used for counting the total number of comparisons between the pattern's and the text's characters. The counter initially has a value of zero. How could I extend this to find multiple occurrences of the same pattern?
I attempted the following in the body of the main() function but it's NOT EFFICIENT although it works. If the first occurrence of the pattern is encountered, its position will be printed and the part of the text which ends with the first occurrence of the pattern will be erased. Moreover, the programme will check the remaining text for the pattern and so on.
int counter=0;
while ((position = Find(pattern,text,ShiftTable,counter)) != -1) {
    cout << position << endl;
    text = text.erase(0,position+m);
}
Any ideas?

Currently you always start at the beginning (i = m - 1). If you want to resume a previous search, just pass in the last position to start from.
In the following I’ve removed the counter variable – what’s the use of that anyway?
int Find(string pattern, string text, int *ShiftTable, int start = 0)
… and …
i = start + m - 1,
… and just call the code as follows, with position starting at 0:
int position = 0;
while ((position = Find(pattern,text,ShiftTable,position)) != -1) {
    cout << position << endl;
    ++position;
}
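Since the asker's Find function is not shown, here is a self-contained sketch of the same idea. It is not the asker's code: the names make_shift_table, find_from and find_all are hypothetical, the shift table is assumed to cover all 256 byte values, and std::string::npos is returned instead of -1.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

using ShiftTable = std::array<std::size_t, 256>;

// Builds the Horspool shift table: for every character, the distance from
// its rightmost occurrence among the first m-1 pattern characters to the
// end of the pattern, or m if it does not occur there.
ShiftTable make_shift_table(const std::string& pattern)
{
    ShiftTable table;
    table.fill(pattern.size());
    for (std::size_t j = 0; j + 1 < pattern.size(); ++j)
        table[static_cast<unsigned char>(pattern[j])] = pattern.size() - 1 - j;
    return table;
}

// Returns the index of the first occurrence of `pattern` in `text` that
// begins at or after `start`, or std::string::npos if there is none.
std::size_t find_from(const std::string& pattern, const std::string& text,
                      const ShiftTable& table, std::size_t start = 0)
{
    const std::size_t m = pattern.size();
    if (m == 0 || text.size() < m)
        return std::string::npos;

    std::size_t i = start + m - 1;  // text index aligned with pattern's last char
    while (i < text.size()) {
        std::size_t k = 0;          // number of characters matched, right to left
        while (k < m && pattern[m - 1 - k] == text[i - k])
            ++k;
        if (k == m)
            return i - m + 1;
        i += table[static_cast<unsigned char>(text[i])];
    }
    return std::string::npos;
}

// Collects every occurrence (overlapping ones included) by resuming the
// search one position after the previous hit, without erasing any text.
std::vector<std::size_t> find_all(const std::string& pattern, const std::string& text)
{
    std::vector<std::size_t> hits;
    const ShiftTable table = make_shift_table(pattern);
    std::size_t pos = find_from(pattern, text, table, 0);
    while (pos != std::string::npos) {
        hits.push_back(pos);
        pos = find_from(pattern, text, table, pos + 1);
    }
    return hits;
}
```

Resuming at pos + 1 reports overlapping matches as well; resuming at pos + pattern.size() instead would report only non-overlapping ones.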

Is the integer n in the string s? Is 4 in Ab56c4D? (C++)

Is there a standard function which will return a bool for this?
I'm writing a program that plays the game of life and the user enters two strings, S23 and B3 are examples. In my main loop I just want to check if an integer (the number of living surround cells) is in one of the strings.
Thanks for your help with this question. ;)

http://www.cplusplus.com/reference/string/string/find/
Searches the string for the content specified in either str, s or c, and returns the position of the first occurrence in the string.
Return Value:
The position of the first occurrence in the string of the searched content.
If the content is not found, the member value npos is returned.

First you need to get a string version of the integer value, then you can try to find it in the other string:
std::ostringstream oss;
oss << some_integer;
if (some_string.find(oss.str()) != std::string::npos)
// match...
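With C++11 or later, std::to_string can replace the stream. A minimal sketch follows; the function name contains_number is hypothetical. Note that it is a plain substring test, so 23 would also be "found" in S23.

```cpp
#include <string>

// Hypothetical helper: true if the decimal representation of `n` appears
// anywhere in `s`.
bool contains_number(const std::string& s, int n)
{
    return s.find(std::to_string(n)) != std::string::npos;
}
```

For the rule strings in the question, contains_number("S23", 3) is true and contains_number("B3", 2) is false.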

Loop through the characters in the string and return true if a character's code, (int)cur_char, is between 48 and 57 (the ASCII codes for '0' through '9').
