Finding all wanted words in a string

I have a very long string, and I want to find and locate every occurrence of certain words. For example, I want to find the locations of all "apple"s in the string. Can you tell me how to do that?
Thanks

Apply std::string::find repeatedly if you are using C++ strings, or std::strstr if you are using C strings; in both cases, start each search n characters after the last match, where n is the length of your word.
std::string str="one apple two apples three apples";
std::string search="apple";
for(std::string::size_type pos=0; pos<str.size(); pos+=search.size())
{
pos=str.find(search, pos);
if(pos==std::string::npos)
break;
std::cout<<"Match found at: "<<pos<<std::endl;
}

Use a loop which repeatedly calls std::string::find; on each iteration, start searching just past your last hit:
std::vector<std::string::size_type> indicesOf( const std::string &s,
const std::string &needle )
{
std::vector<std::string::size_type> indices;
std::string::size_type p = 0;
while ( p < s.size() ) {
std::string::size_type q = s.find( needle, p );
if ( q == std::string::npos ) {
break;
}
indices.push_back( q );
p = q + needle.size(); // change needle.size() to 1 for overlapping matches
}
return indices;
}
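
The comment about overlapping matches in the function above can be made explicit with a flag. The following is only a sketch: the name findAll and the allowOverlap parameter are illustrative assumptions, not part of the answers here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (name and signature are illustrative only).
// Returns the start index of every occurrence of `word` in `text`.
std::vector<std::size_t> findAll(const std::string& text,
                                 const std::string& word,
                                 bool allowOverlap = false)
{
    std::vector<std::size_t> positions;
    if (word.empty())
        return positions;   // an empty word would match everywhere; report nothing

    // Step by the word length for disjoint matches, or by one character
    // when overlapping matches should be reported as well.
    const std::size_t step = allowOverlap ? 1 : word.size();
    for (std::size_t pos = text.find(word);
         pos != std::string::npos;
         pos = text.find(word, pos + step))
    {
        positions.push_back(pos);
    }
    return positions;
}
```

For "one apple two apples three apples" and "apple" this reports 4, 14 and 27. For "aaaa" and "aa" it reports 0 and 2 by default, and 0, 1 and 2 when allowOverlap is true.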

void findApples(const char* someString)
{
const char* loc = NULL;
while ((loc = strstr(someString, "apple")) != NULL) {
// do something
someString = loc + strlen("apple");
}
}

Related

Replace a subString in an std::string but not all of them using C++

I'm new to C++. I know this post may look like a duplicate of others, but what I want is to replace a substring in a string, though not every occurrence of it.
This is my find-and-replace function; it works like the other replace functions:
void findAndReplaceAll(std::string& data, std::string toSearch, std::string replaceStr)
{
//Get the first occurrence
size_t pos = data.find(toSearch);
//Repeat till end is reached
while (pos != std::string::npos)
{
//Replace this occurrence of Sub String
data.replace(pos, toSearch.size(), replaceStr);
//Get the next occurrence from the current position
pos = data.find(toSearch, pos + replaceStr.size());
}
}
My main function:
int main()
{
std::string format = "h 'o''cloch' a, zzzz";
findAndReplaceAll(format, "h", "%h");
return 0;
}
The output that I want replaces only the first 'h', not the second one:
"%h 'o''cloch' a, zzzz";

You can add an argument to your function that tells it after how many characters to stop replacing substrings.
The function prototype would look something like this: void findAndReplaceAll(std::string& data, std::string toSearch, std::string replaceStr, int stopAfterXCharacters).
You would then need to change your while loop to stop once that many characters have been read.
Alternatively, you could have a function that replaces only a certain number of substrings; in your case it would return as soon as one substring has been replaced.
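
A minimal sketch of the second suggestion, assuming a count limit is what is wanted. The name findAndReplaceN and the maxReplacements parameter are made up for illustration; they are not from the question.

```cpp
#include <cstddef>
#include <string>

// Hypothetical variant of findAndReplaceAll: `maxReplacements` is an assumed
// extra parameter that caps how many occurrences are replaced.
// Returns the number of replacements actually performed.
std::size_t findAndReplaceN(std::string& data,
                            const std::string& toSearch,
                            const std::string& replaceStr,
                            std::size_t maxReplacements)
{
    std::size_t done = 0;
    if (toSearch.empty())
        return done;   // searching for "" would never advance

    std::size_t pos = data.find(toSearch);
    while (pos != std::string::npos && done < maxReplacements) {
        data.replace(pos, toSearch.size(), replaceStr);
        ++done;
        // Continue after the inserted text so it is never matched again.
        pos = data.find(toSearch, pos + replaceStr.size());
    }
    return done;
}
```

Calling findAndReplaceN(format, "h", "%h", 1) on "h 'o''cloch' a, zzzz" changes only the leading h and gives "%h 'o''cloch' a, zzzz".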

You can write a separate function that replaces only the first occurrence it finds.
Here is a demonstration program:
#include <iostream>
#include <string>
bool findAndReplace( std::string &data,
const std::string &toSearch,
const std::string &replaceStr,
std::string::size_type pos = 0 )
{
bool success = pos < data.size();
if ( success )
{
success = ( pos = data.find( toSearch, pos ) ) != std::string::npos;
if ( success )
{
data.replace( pos, toSearch.size(), replaceStr );
}
}
return success;
}
int main()
{
std::string format = "h 'o''cloch' a, zzzz";
findAndReplace( format, "h", "%h" );
std::cout << "\"" << format << "\"\n";
return 0;
}
Its output is
"%h 'o''cloch' a, zzzz"

C++ function to replace in a string all occurrences of a given substring

I want a function that takes a string and replaces all occurrences of a given word with asterisks in place of its letters. I want to do this elegantly, like a real C++ programmer.
As an example,
int main()
{
std::string str = "crap this craping shit.";
censor_word("crap", str);
std::cout << str;
return 0;
}
should output
"**** this ****ing shit"
I need help coming up with an elegant way of filling in the following function:
void censor_word(const std::string& word, std::string& text)
{
...
}
I know the geniuses at Stack Overflow can probably come up with a 1-line solution.
My code looks yucky
void censor_word(const std::string& word, std::string& text)
{
int wordsize= word.size();
if (wordsize < text.size())
{
for (std::string::iterator it(text.begin()), endpos(text.size() - wordsize), int curpos = 0; it != endpos; ++it, ++curpos)
{
if (text.substr(curpos, wordsize) == word)
{
std::string repstr(wordsize, '*');
text.replace(curpos, wordsize, repstr);
}
}
}
}
Teach me how to do this the way that a C++ purist would do it.

for( auto pos = str.find( word ); pos != std::string::npos; pos = str.find( word ) )
{
str.replace( str.begin() + pos, str.begin() + pos + word.size(), word.size(),'*' );
}
We find the first appearance of the word we want replaced. We then replace it. We do this until there are no more appearances, as they have all been replaced.
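
The loop above restarts the search from the beginning after every replacement, which would never terminate if the word itself contained an asterisk. A possible variant, offered only as a sketch, fills in the censor_word signature from the question and resumes the search after each censored span.

```cpp
#include <cstddef>
#include <string>

// Sketch of censor_word using the signature from the question.
// Each occurrence of `word` in `text` is overwritten with asterisks.
void censor_word(const std::string& word, std::string& text)
{
    if (word.empty())
        return;   // nothing to censor

    for (std::size_t pos = text.find(word);
         pos != std::string::npos;
         pos = text.find(word, pos + word.size()))   // resume after the censored span
    {
        // replace(pos, count, n, ch): swap `count` characters for n copies of ch
        text.replace(pos, word.size(), word.size(), '*');
    }
}
```

With the example from the question, "crap this craping shit." becomes "**** this ****ing shit.".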

How to find and replace all occurrences of a substring in a string?

I need to search a string and edit the formatting of it.
So far I can replace the first occurrence of the string, but I am unable to do so with the next occurrences of this string.
This is what I have working, sort of:
if(chartDataString.find("*A") == string::npos){ return;}
else{chartDataString.replace(chartDataString.find("*A"), 3,"[A]\n");}
If it doesn't find the string, nothing prints at all, so that's not good.
I know I need to loop through the entire string chartDataString and replace all occurrences. I know there are a lot of similar posts to this but I don't understand (like this Replace substring with another substring C++)
I've also tried to do something like this to loop over the string:
string toSearch = chartDataString;
string toFind = "*A:";
for (int i = 0; i<toSearch.length() - toFind.length(); i++){
if(toSearch.substr(i, toFind.length()) == toFind){
chartDataString.replace(chartDataString.find(toFind), 3, "[A]\n");
}
}
EDIT
taking into consideration suggestions, this in theory should work, but I don't know why it doesn't
size_t startPos=0;
string myString = "*A";
while(string::npos != (startPos = chartDataString.find(myString, startPos))){
chartDataString.replace(chartDataString.find(myString, startPos), 3, "*A\n");
startPos = startPos + myString.length();
}

Try the following:
const std::string s = "*A";
const std::string t = "*A\n";
std::string::size_type n = 0;
while ( ( n = chartDataString.find( s, n ) ) != std::string::npos )
{
chartDataString.replace( n, s.size(), t );
n += t.size();
}

If Boost is available, you can use the following (both functions are declared in <boost/algorithm/string/replace.hpp>):
std::string origStr = "this string has *A and then another *A";
std::string subStringToRemove = "*A";
std::string subStringToReplace = "[A]";
boost::replace_all(origStr , subStringToRemove , subStringToReplace);
This modifies the original string in place. Alternatively:
std::string result = boost::replace_all_copy(origStr , subStringToRemove , subStringToReplace);
This returns a modified copy and leaves the original string untouched.

Use std::regex_replace, available since C++11. This does exactly what you want and more.
https://en.cppreference.com/w/cpp/regex/regex_replace
std::string const result = std::regex_replace( chartDataString, std::regex( "\\*A" ), "[A]\n" );

/// Returns a version of 'str' where every occurrence of
/// 'find' is substituted by 'replace'.
/// - Inspired by James Kanze.
/// - http://stackoverflow.com/questions/20406744/
std::string replace_all(
const std::string & str , // where to work
const std::string & find , // substitute 'find'
const std::string & replace // by 'replace'
) {
using namespace std;
string result;
size_t find_len = find.size();
size_t pos,from=0;
while ( string::npos != ( pos=str.find(find,from) ) ) {
result.append( str, from, pos-from );
result.append( replace );
from = pos + find_len;
}
result.append( str, from , string::npos );
return result;
/*
This code might be an improvement to James Kanze's
because it uses std::string methods instead of
general algorithms [as 'std::search()'].
*/
}
int main() {
{
std::string test = "*A ... *A ... *A ...";
std::string changed = "*A\n ... *A\n ... *A\n ...";
assert( changed == replace_all( test, "*A", "*A\n" ) );
}
{
std::string GB = "My gorila ate the banana";
std::string gg = replace_all( GB, "gorila", "banana" );
assert( gg == "My banana ate the banana" );
gg = replace_all( gg, "banana", "gorila" );
assert( gg == "My gorila ate the gorila" );
std::string bb = replace_all( GB, "banana", "gorila" );
assert( bb == "My gorila ate the gorila" );
bb = replace_all( bb, "gorila" , "banana" );
assert( bb == "My banana ate the banana" );
}
{
std::string str, res;
str.assign( "ababaabcd" );
res = replace_all( str, "ab", "fg");
assert( res == "fgfgafgcd" );
str="aaaaaaaa"; assert( 8==str.size() );
res = replace_all( str, "aa", "a" );
assert( res == "aaaa" );
assert( "" == replace_all( str, "aa", "" ) );
str = "aaaaaaa"; assert( 7==str.size() );
res = replace_all( str, "aa", "a" );
assert( res == "aaaa" );
str = "..aaaaaa.."; assert( 10==str.size() );
res = replace_all( str, "aa", "a" );
assert( res == "..aaa.." );
str = "baaaac"; assert( 6==str.size() );
res = replace_all( str, "aa", "" );
assert( res == "bc" );
}
}

The find function takes an optional second argument: the position from which to begin searching. By default this is zero.
A good position to begin searching for the next match is the position where the previous replacement was inserted, plus that replacement's length. For instance if we insert a string of length 3 at position 7, then the next find should begin at position 10.
If the search string happens to be a substring of the replacement, this approach will avoid an infinite loop. Imagine if you try to replace all occurrences of log with analog, but don't skip over the replacement.
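
The log/analog case can be sketched as follows. The function name replaceInPlace is an assumption made for this illustration.

```cpp
#include <cstddef>
#include <string>

// Illustrative in-place replace-all (hypothetical name).
// After each replacement the search resumes just past the inserted text.
void replaceInPlace(std::string& text, const std::string& from, const std::string& to)
{
    if (from.empty())
        return;   // an empty search string would match at every position

    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        // Skip the replacement: without this step, the "log" inside the
        // freshly inserted "analog" would be found again on every pass.
        pos += to.size();
    }
}
```

Applied to "log and logbook" with from = "log" and to = "analog", the first match at position 0 is replaced and the next search starts at position 6, so the result is "analog and analogbook" and the loop terminates.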

It's fairly awkward (and probably not too efficient) to do it in
place. I usually use a function along the lines of:
std::string
replaceAll( std::string const& original, std::string const& from, std::string const& to )
{
std::string results;
std::string::const_iterator end = original.end();
std::string::const_iterator current = original.begin();
std::string::const_iterator next = std::search( current, end, from.begin(), from.end() );
while ( next != end ) {
results.append( current, next );
results.append( to );
current = next + from.size();
next = std::search( current, end, from.begin(), from.end() );
}
results.append( current, next );
return results;
}
Basically, you loop as long as you can find an instance of
from, appending the intermediate text and to, and advancing
to the next instance of from. At the end, you append any text
after the last instance of from.
(If you're going to do much programming in C++, it's probably
a good idea to get used to using iterators, like the above,
rather than the special member functions of std::string.
Things like the above can be made to work with any of the C++
container types, and for this reason, are more idiomatic.)
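
To illustrate the remark about other container types, here is one way the same loop might be generalised. This is a sketch under assumptions: the template name replaceAllIn is invented, and it expects a standard sequence container that supports insert(position, first, last).

```cpp
#include <algorithm>
#include <iterator>

// Hypothetical generalisation of the function above to any standard
// sequence container (std::string, std::vector, std::deque, std::list).
template <typename Container>
Container replaceAllIn(const Container& original, const Container& from, const Container& to)
{
    if (from.empty())
        return original;   // an empty pattern would match everywhere

    Container results;
    typename Container::const_iterator end = original.end();
    typename Container::const_iterator current = original.begin();
    typename Container::const_iterator next =
        std::search(current, end, from.begin(), from.end());
    while (next != end) {
        results.insert(results.end(), current, next);           // text before the match
        results.insert(results.end(), to.begin(), to.end());    // the replacement
        current = next;
        std::advance(current, std::distance(from.begin(), from.end()));
        next = std::search(current, end, from.begin(), from.end());
    }
    results.insert(results.end(), current, end);                // text after the last match
    return results;
}
```

For a std::vector<int> holding 1 2 3 1 2, replacing the sequence 1 2 with the single element 9 gives 9 3 9; the same template also accepts std::string arguments.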

Below is a complete demonstration of how string::find, string::replace and std::replace work.
There is no built-in replaceAll for std::string in C++.
We can build one on top of replace:
string original = "A abc abc abc A";
string test = original;
cout << endl << "Original string: " << original; //output: A abc abc abc A
//FINDING INDEX WHERE QUERY SUBSTRING FOUND
int index = test.find("a");
cout << endl << "index: " << index; //output: 2
int outOfBoundIndex = test.find("xyz");
cout << endl << "outOfBoundIndex: " << outOfBoundIndex; //output: -1
//REPLACE SINGLE OCCURENCES
string queryString = "abc";
int queryStringLength = queryString.size();
index = test.find(queryString);
if(index > -1 && index < (test.size() - 1))
test.replace(index, queryStringLength, "xyz");
cout << endl << endl << "first occurrence \'abc\' replaced to \'xyz\': " << test; //output: A xyz abc abc A
//REPLACE ALL OCCURRENCES
test = original;
//there is a cpp utility function to replace all occurrence of single character. It will not work for replacing all occurences of string.
replace(test.begin(), test.end(), 'a', 'X');
cout << endl << endl << "Replacing all occurences of character \'a\' with \'X\': " << test; //output: A Xbc Xbc Xbc A
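test = original;
const string replacement = "xyz";
//find returns string::npos (not -1) when nothing is found, so keep the result in a size_t
size_t found = test.find(queryString);
while(found != string::npos){
test.replace(found, queryStringLength, replacement);
//resume after the inserted text so a replacement that contains the query cannot loop forever
found = test.find(queryString, found + replacement.size());
}
cout << endl << "replaceAll implementation: " << test; //output: A xyz xyz xyz A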

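string replaceAll(string del, string replace, string line){
    if(del.empty())
        return line; // nothing to search for
    size_t pos = 0;
    // find returns string::npos when there is no further match
    while((pos = line.find(del, pos)) != string::npos){
        line.replace(pos, del.length(), replace);
        pos += replace.length(); // skip the inserted text so it is not matched again
    }
    return line; // unchanged if del never occurs
}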

If the string you search for and its replacement are not the same size:
void Replace::replace(std::string & str, std::string const & s1, std::string const & s2)
{
size_t pos = 0;
while ((pos = str.find(s1, pos)) != std::string::npos)
{
str.erase(pos, s1.length());
str.insert(pos, s2);
pos += s2.length();
}
}

Extract substrings of a filename

In C/C++, how can I extract from c:\Blabla - dsf\blup\AAA - BBB\blabla.bmp the substrings AAA and BBB ?
i.e. extract the parts before and after - in the last folder of a filename.
Thanks in advance.
(PS: if possible, with no Framework .net or such things, in which I could easily get lost)

#include <iostream>
#include <sstream>
#include <string>
#include <windows.h>
#include <Shlwapi.h> // link with shlwapi.lib
using namespace std;
int main()
{
char buffer_1[ ] = "c:\\Blabla - dsf\\blup\\AAA - BBB\\blabla.bmp";
char *lpStr1 = buffer_1;
// Remove the file name from the string
PathRemoveFileSpec(lpStr1);
string s(lpStr1);
// Find the last directory name
stringstream ss(s.substr(s.rfind('\\') + 1));
// Split the last directory name into tokens separated by '-'
while (getline(ss, s, '-'))
cout << s << endl;
}
The explanation is in the comments.
This doesn't trim the spaces around the tokens in the output; if you also want to do that, check this.

This can be done relatively easily with regular expressions: std::regex if you have C++11, boost::regex if you don't. Here path is assumed to be a std::string holding the filename:
static std::regex const regex( R"(.*\\(\w+)\s*-\s*(\w+)\\[^\\]*$)" );
std::smatch results;
if ( std::regex_match( path, results, regex ) ) {
    std::string firstMatch = results[1];
    std::string secondMatch = results[2];
    // ...
}
Also, you definitely should have the functions split and
trim in your toolkit:
template <std::ctype_base::mask test>
class IsNot
{
std::locale ensureLifetime;
std::ctype<char> const* ctype; // Pointer to allow assignment
public:
IsNot( std::locale const& loc = std::locale() )
: ensureLifetime( loc )
, ctype( &std::use_facet<std::ctype<char>>( loc ) )
{
}
bool operator()( char ch ) const
{
return !ctype->is( test, ch );
}
};
typedef IsNot<std::ctype_base::space> IsNotSpace;
std::vector<std::string>
split( std::string const& original, char separator )
{
std::vector<std::string> results;
std::string::const_iterator current = original.begin();
std::string::const_iterator end = original.end();
std::string::const_iterator next = std::find( current, end, separator );
while ( next != end ) {
results.push_back( std::string( current, next ) );
current = next + 1;
next = std::find( current, end, separator );
}
results.push_back( std::string( current, next ) );
return results;
}
std::string
trim( std::string const& original )
{
std::string::const_iterator end
= std::find_if( original.rbegin(), original.rend(), IsNotSpace() ).base();
std::string::const_iterator begin
= std::find_if( original.begin(), end, IsNotSpace() );
return std::string( begin, end );
}
(These are just the ones you need here. You'll obviously want
the full complement of IsXxx and IsNotXxx predicates, a split
which can split according to a regular expression, a trim which
can be passed a predicate object specifying what is to be
trimmed, etc.)
Anyway, the application of split and trim should be obvious
to give you what you want.
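
For readers who want that application spelled out, here is a sketch. To stay self-contained it uses simplified stand-ins (splitOn and trimSpaces, both hypothetical names that handle plain spaces only) rather than the locale-aware versions above, and lastFolderParts is likewise an invented name.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for split: cuts `s` at every `sep`.
std::vector<std::string> splitOn(const std::string& s, char sep)
{
    std::vector<std::string> out;
    std::size_t start = 0;
    for (std::size_t pos = s.find(sep); pos != std::string::npos; pos = s.find(sep, start)) {
        out.push_back(s.substr(start, pos - start));
        start = pos + 1;
    }
    out.push_back(s.substr(start));   // the piece after the last separator
    return out;
}

// Simplified stand-in for trim: strips leading and trailing spaces only.
std::string trimSpaces(const std::string& s)
{
    const std::size_t first = s.find_first_not_of(' ');
    if (first == std::string::npos)
        return std::string();         // empty or all spaces
    const std::size_t last = s.find_last_not_of(' ');
    return s.substr(first, last - first + 1);
}

// Splits the path at '\', takes the last folder, splits it at '-' and trims both halves.
// Returns a pair of empty strings if the path does not have that shape.
std::pair<std::string, std::string> lastFolderParts(const std::string& path)
{
    const std::vector<std::string> components = splitOn(path, '\\');
    if (components.size() < 2)
        return std::pair<std::string, std::string>();
    const std::vector<std::string> parts = splitOn(components[components.size() - 2], '-');
    if (parts.size() != 2)
        return std::pair<std::string, std::string>();
    return std::make_pair(trimSpaces(parts[0]), trimSpaces(parts[1]));
}
```

For the path in the question, lastFolderParts returns the pair ("AAA", "BBB").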

This does all the work and validations in plain C:
int FindParts(const char* source, char** firstOut, char** secondOut)
{
const char* last = NULL;
const char* previous = NULL;
const char* middle = NULL;
const char* middle1 = NULL;
const char* middle2 = NULL;
char* first;
char* second;
last = strrchr(source, '\\');
if (!last || (last == source))
return -1;
--last;
if (last == source)
return -1;
previous = last;
for (; (previous != source) && (*previous != '\\'); --previous);
++previous;
{
middle = strchr(previous, '-');
if (!middle || (middle > last))
return -1;
middle1 = middle-1;
middle2 = middle+1;
}
// now skip spaces
for (; (previous != middle1) && (*previous == ' '); ++previous);
if (previous == middle1)
return -1;
for (; (middle1 != previous) && (*middle1 == ' '); --middle1);
if (middle1 == previous)
return -1;
for (; (middle2 != last) && (*middle2 == ' '); ++middle2);
if (middle2 == last)
return -1;
for (; (middle2 != last) && (*last == ' '); --last);
if (middle2 == last)
return -1;
first = (char*)malloc(middle1-previous+1 + 1);
second = (char*)malloc(last-middle2+1 + 1);
if (!first || !second)
{
free(first);
free(second);
return -1;
}
strncpy(first, previous, middle1-previous+1);
first[middle1-previous+1] = '\0';
strncpy(second, middle2, last-middle2+1);
second[last-middle2+1] = '\0';
*firstOut = first;
*secondOut = second;
return 1;
}

A plain C++ solution (without Boost or C++11), although the regex solution of James Kanze (https://stackoverflow.com/a/16605408/1032277) is the most generic and elegant:
inline void Trim(std::string& source)
{
size_t position = source.find_first_not_of(" ");
if (std::string::npos != position)
source = source.substr(position);
position = source.find_last_not_of(" ");
if (std::string::npos != position)
source = source.substr(0, position+1);
}
inline bool FindParts(const std::string& source, std::string& first, std::string& second)
{
size_t last = source.find_last_of('\\');
if ((std::string::npos == last) || !last)
return false;
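size_t previous = source.find_last_of('\\', last-1);
// If there is no earlier backslash, previous is npos; 1+previous then wraps to 0,
// so the directory name is taken to start at the beginning of the string.
size_t middle = source.find_first_of('-',1+previous);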
if ((std::string::npos == middle) || (middle > last))
return false;
first = source.substr(1+previous, (middle-1)-(1+previous)+1);
second = source.substr(1+middle, (last-1)-(1+middle)+1);
Trim(first);
Trim(second);
return true;
}

Use std::string::rfind, i.e. rfind(char c, size_t pos = npos):
Find the last '\' with rfind (pos1).
Find the previous '\' with rfind, searching backwards from pos1 - 1 (pos2).
Get the substring between the positions pos2 and pos1 with substr.
Find the character '-' in it (pos3).
Extract the 2 substrings between pos2 and pos3, and between pos3 and pos1.
Remove the spaces in the substrings.
The resulting substrings will be AAA and BBB.
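
The steps above can be sketched as follows. The names extractParts and stripSpaces are assumptions chosen for this illustration, and only plain spaces are stripped.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: removes leading and trailing spaces.
std::string stripSpaces(const std::string& s)
{
    const std::size_t first = s.find_first_not_of(' ');
    if (first == std::string::npos)
        return std::string();
    return s.substr(first, s.find_last_not_of(' ') - first + 1);
}

// Hypothetical function following the rfind steps.
// Returns false if the path has no folder part or the folder has no '-'.
bool extractParts(const std::string& path, std::string& first, std::string& second)
{
    const std::size_t pos1 = path.rfind('\\');               // last backslash
    if (pos1 == std::string::npos || pos1 == 0)
        return false;
    const std::size_t pos2 = path.rfind('\\', pos1 - 1);     // backslash before it, if any
    const std::size_t begin = (pos2 == std::string::npos) ? 0 : pos2 + 1;
    const std::string folder = path.substr(begin, pos1 - begin);

    const std::size_t pos3 = folder.find('-');               // separator inside the folder name
    if (pos3 == std::string::npos)
        return false;
    first = stripSpaces(folder.substr(0, pos3));
    second = stripSpaces(folder.substr(pos3 + 1));
    return true;
}
```

For the path in the question this sets first to "AAA" and second to "BBB".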

set<string>: how to list not strings starting with given string and ending with `/`?

For example, our set contains:
bin/obj/Debug/CloudServerPrototype/ra.write.1.tlog
bin/obj/Debug/CloudServerPrototype/rc.write.1.tlog
bin/obj/Debug/vc100.idb
bin/obj/Debug/vc100.pdb
So this is what I tried, based on this great answer:
#include <iostream>
#include <algorithm>
#include <set>
#include <string>
#include <iterator>
using namespace std;
struct get_pertinent_part
{
const std::string given_string;
get_pertinent_part(const std::string& s)
:given_string(s)
{
}
std::string operator()(const std::string& s)
{
std::string::size_type first = 0;
if (s.find(given_string) == 0)
{
first = given_string.length() + 1;
}
std::string::size_type count = std::string::npos;
std::string::size_type pos = s.find_last_of("/");
if (pos != std::string::npos && pos > first)
{
count = pos + 1 - first;
}
return s.substr(first, count);
}
};
void directory_listning_without_directories_demo()
{
set<string> output;
set<string> demo_set;
demo_set.insert("file1");
demo_set.insert("file2");
demo_set.insert("folder/file1");
demo_set.insert("folder/file2");
demo_set.insert("folder/folder/file1");
demo_set.insert("folder/folder/file2");
demo_set.insert("bin/obj/Debug/CloudServerPrototype/ra.write.1.tlog");
demo_set.insert("bin/obj/Debug/CloudServerPrototype/rc.write.1.tlog");
demo_set.insert("bin/obj/Debug/vc100.idb");
demo_set.insert("bin/obj/Debug/vc100.pdb");
std::transform(demo_set.begin(),
demo_set.end(),
std::inserter(output, output.end()),
get_pertinent_part("bin/obj/Debug/"));
std::copy(output.begin(),
output.end(),
std::ostream_iterator<std::string>(std::cout, "\n"));
}
int main()
{
directory_listning_without_directories_demo();
cin.get();
return 0;
}
This outputs:
CloudServerPrototype/
file1
file2
folder/
folder/folder/
vc100.idb
vc100.pdb
and we are given the string bin/obj/Debug/. We want to print:
vc100.idb
vc100.pdb
CloudServerPrototype/
How can this be done?

Here is a quick example of what you want to do.
std::string::find: http://www.cplusplus.com/reference/string/string/find/
std::string::substr: http://www.cplusplus.com/reference/string/string/substr/
string str = "bin/obj/Debug/vc100.pdb";
string checkString ("bin/obj/Debug");
// Check if string starts with the check string
if (str.find(checkString) == 0){
// Check if the last character is a "/"
if(str.substr(str.length()-1,1) == "/"){
// Output starting at the end of the check string and for
// the difference in the strings.
cout << str.substr(checkString.length(), (str.length() - checkString.length()) ) << endl;
}
}

It's not clear with which part of the problem you are stuck, so here is a starter for you.
To get the parts of the strings between "given string" and the final '/' (where present):
std::string get_pertinent_part(const std::string& s)
{
std::string::size_type first = 0;
if (s.find(given_string) == 0)
{
first = given_string.length() + 1;
}
std::string::size_type count = std::string::npos;
std::string::size_type pos = s.find_last_of("/");
if (pos != std::string::npos && pos > first)
{
count = pos + 1 - first;
}
return s.substr(first, count);
}
To insert these parts into a new set (output) to guarantee uniqueness you can use the following:
std::transform(your_set.begin(),
your_set.end(),
std::inserter(output, output.end()),
get_pertinent_part);
You may wish to pass given_string into get_pertinent_part(), in which case you'll need to convert it to a functor:
struct get_pertinent_part
{
const std::string given_string;
get_pertinent_part(const std::string& s)
:given_string(s)
{
}
std::string operator()(const std::string& s)
{
std::string::size_type first = 0;
//
// ...same code as before...
//
return s.substr(first, count);
}
};
You can then call it this way:
std::transform(your_set.begin(),
your_set.end(),
std::inserter(output, output.end()),
get_pertinent_part("bin/obj/Debug"));
To output the new set:
std::copy(output.begin(),
output.end(),
std::ostream_iterator<std::string>(std::cout, "\n"));
Sorting the results is left as an exercise.
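
If the goal is exactly the listing asked for in the question (only entries under the given prefix, with subfolders collapsed to their name plus '/'), one possible approach is sketched below. The name listDirectChildren is hypothetical, and the prefix is assumed to end with '/'.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical helper: for every path that starts with `prefix`, keeps the
// part after the prefix, cut just after its first '/' when it names a subfolder.
// Assumes `prefix` already ends with '/'. The set removes duplicates and sorts.
std::set<std::string> listDirectChildren(const std::set<std::string>& paths,
                                         const std::string& prefix)
{
    std::set<std::string> children;
    for (std::set<std::string>::const_iterator it = paths.begin(); it != paths.end(); ++it) {
        if (it->compare(0, prefix.size(), prefix) != 0)
            continue;                                   // does not start with the prefix
        const std::string rest = it->substr(prefix.size());
        if (rest.empty())
            continue;                                   // the prefix itself
        const std::size_t slash = rest.find('/');
        children.insert(slash == std::string::npos
                            ? rest                      // a file directly under the prefix
                            : rest.substr(0, slash + 1)); // a subfolder, keep the trailing '/'
    }
    return children;
}
```

With the demo set from the question and the prefix "bin/obj/Debug/", the result holds CloudServerPrototype/, vc100.idb and vc100.pdb, in that order because std::set sorts its elements.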

The easiest way I can think of, using the standard C functions, would be:
const char * string1 = "bin/obj/Debug";
const char * string2 = "bin/obj/Debug/CloudServerPrototype/rc.write.1.tlog";
char result[64];
// the above code is just to bring the strings into this example
const char * position = strstr(string2, string1); // strstr(haystack, needle)
if(position != NULL){
    position += strlen(string1); // move past the prefix
    if(*position == '/')
        ++position; // step over the separating '/'
    const char * slash = strchr(position, '/');
    // copy up to the next '/', or the whole remainder if there is none
    size_t substringLength = slash ? (size_t)(slash - position) : strlen(position);
    if(substringLength >= sizeof result)
        substringLength = sizeof result - 1;
    strncpy(result, position, substringLength);
    result[substringLength] = '\0'; // strncpy does not terminate the copy here
}else{
    strcpy(result, string1); // this case is for when your first string is not found
}
cout << result;
First, strstr looks for the prefix, string1, inside the string we are analyzing, string2. If it is found, we add the prefix's length to that starting point using pointer arithmetic and step over the separating '/'. The length of the wanted part is then the end position, found with strchr(position, '/'), minus the start position. Finally we copy that part into a buffer, terminate it, and print it with cout.
I am sure there is a fancy way of doing this with std::string, but I'll leave that to anyone who can better explain C++ strings; I never did manage to get comfortable with them, haha.
