I have a string that I would like to tokenize.
But the C strtok() function requires my string to be a char*.
How can I do this simply?
I tried:
token = strtok(str.c_str(), " ");
which fails because it turns it into a const char*, not a char*
#include <iostream>
#include <string>
#include <sstream>
int main(){
std::string myText("some-text-to-tokenize");
std::istringstream iss(myText);
std::string token;
while (std::getline(iss, token, '-'))
{
std::cout << token << std::endl;
}
return 0;
}
Or, as mentioned, use boost for more flexibility.
Duplicate the string, tokenize it, then free it.
char *dup = strdup(str.c_str());
token = strtok(dup, " ");
free(dup);
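Note that every token returned by strtok points into the duplicated buffer, so the tokens have to be used or copied before that buffer is released. A minimal sketch of the same idea, assuming the tokens are wanted as separate std::string objects; the helper name tokenize_copy is made up for illustration, and a std::vector<char> stands in for strdup()/free():

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: tokenizes a private copy, so `text` itself is never modified.
std::vector<std::string> tokenize_copy(const std::string& text, const char* delims) {
    // Writable, null-terminated copy of the input.
    std::vector<char> buffer(text.begin(), text.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;
    for (char* tok = std::strtok(buffer.data(), delims); tok != nullptr;
         tok = std::strtok(nullptr, delims)) {
        tokens.emplace_back(tok);  // copy the token out before the buffer goes away
    }
    return tokens;
}
```

Calling tokenize_copy(str, " ") would then return the words of str, leaving str unchanged. Like strtok itself, this sketch stops at the first embedded null character in the input.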
If boost is available on your system (I think it's standard on most Linux distros these days), it has a Tokenizer class you can use.
If not, then a quick Google turns up a hand-rolled tokenizer for std::string that you can probably just copy and paste. It's very short.
And, if you don't like either of those, then here's a split() function I wrote to make my life easier. It'll break a string into pieces using any of the chars in "delim" as separators. Pieces are appended to the "parts" vector:
void split(const string& str, const string& delim, vector<string>& parts) {
size_t start, end = 0;
while (end < str.size()) {
start = end;
while (start < str.size() && (delim.find(str[start]) != string::npos)) {
start++; // skip leading delimiters
}
end = start;
while (end < str.size() && (delim.find(str[end]) == string::npos)) {
end++; // skip to end of word
}
if (end-start != 0) { // just ignore zero-length strings.
parts.push_back(string(str, start, end-start));
}
}
}
There is a more elegant solution.
With std::string you can use resize() to allocate a suitably large buffer, and &s[0] to get a pointer to the internal buffer.
At this point many fine folks will jump and yell at the screen. But this is the fact. About 2 years ago
the library working group decided (meeting at Lillehammer) that just like for std::vector, std::string should also formally, not just in practice, have a guaranteed contiguous buffer.
The other concern is whether strtok() increases the size of the string. The MSDN documentation says:
Each call to strtok modifies strToken by inserting a null character after the token returned by that call.
But this is not correct. Actually the function replaces the first occurrence of a separator character with \0. No change in the size of the string. If we have this string:
one-two---three--four
we will end up with
one\0two\0--three\0-four
So my solution is very simple:
std::string str("some-text-to-split");
char seps[] = "-";
char *token;
token = strtok( &str[0], seps );
while( token != NULL )
{
/* Do your thing */
token = strtok( NULL, seps );
}
Read the discussion on http://www.archivum.info/comp.lang.c++/2008-05/02889/does_std::string_have_something_like_CString::GetBuffer
With C++17, std::string gains a non-const data() overload that returns a pointer to a modifiable buffer, so the string can be passed to strtok directly without any hacks:
#include <string>
#include <iostream>
#include <cstring>
#include <cstdlib>
int main()
{
::std::string text{"pop dop rop"};
char const * const psz_delimiter{" "};
char * psz_token{::std::strtok(text.data(), psz_delimiter)};
while(nullptr != psz_token)
{
::std::cout << psz_token << ::std::endl;
psz_token = std::strtok(nullptr, psz_delimiter);
}
return EXIT_SUCCESS;
}
output
pop
dop
rop
EDIT: the const_cast below is used only to demonstrate the effect of strtok() when applied to a pointer returned by string::c_str().
You should not use strtok() on that pointer, since it modifies the tokenized string, which may lead to undesired, if not undefined, behaviour, as the C string "belongs" to the string instance.
#include <string>
#include <iostream>
#include <cstring>
int main(int ac, char **av)
{
std::string theString("hello world");
std::cout << theString << " - " << theString.size() << std::endl;
//--- this cast *only* to illustrate the effect of strtok() on std::string
char *token = strtok(const_cast<char *>(theString.c_str()), " ");
std::cout << theString << " - " << theString.size() << std::endl;
return 0;
}
After the call to strtok(), the space was "removed" from the string, or turned down to a non-printable character, but the length remains unchanged.
>./a.out
hello world - 11
helloworld - 11
Therefore you have to resort to a native C++ mechanism, a duplicate of the string, or a third-party library, as previously mentioned.
I suppose the language is C, or C++...
strtok, IIRC, replaces separators with \0. That's why it cannot be used on a const string.
To work around that "quickly", if the string isn't huge, you can just strdup() it. That is wise if you need to keep the string unaltered (which is what the const suggests...).
On the other hand, you might want to use another tokenizer, perhaps hand rolled, less violent on the given argument.
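A hand-rolled tokenizer of that kind might look like the sketch below. It is only an illustration: the name split_any is invented here, and it assumes std::string input and the same "any of these characters is a separator, empty tokens are skipped" behaviour that strtok has. It only reads the input, so a const string is fine.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits `text` on any character in `delims`, skipping empty tokens.
std::vector<std::string> split_any(const std::string& text, const std::string& delims) {
    std::vector<std::string> tokens;
    // Start of the first token: first character that is not a delimiter.
    std::string::size_type start = text.find_first_not_of(delims);
    while (start != std::string::npos) {
        // End of the token: next delimiter, or npos if the token runs to the end.
        std::string::size_type end = text.find_first_of(delims, start);
        tokens.push_back(text.substr(start, end - start));
        // Start of the next token; find_first_not_of returns npos when `end` is npos.
        start = text.find_first_not_of(delims, end);
    }
    return tokens;
}
```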
Assuming that by "string" you're talking about std::string in C++, you might have a look at the Tokenizer package in Boost.
First off I would say use boost tokenizer.
Alternatively if your data is space separated then the string stream library is very useful.
But both the above have already been covered.
So as a third C-Like alternative I propose copying the std::string into a buffer for modification.
std::string data("The data I want to tokenize");
// Create a buffer of the correct length:
std::vector<char> buffer(data.size()+1);
// copy the string into the buffer
strcpy(&buffer[0],data.c_str());
// Tokenize
strtok(&buffer[0]," ");
If you don't mind open source, you could use the subbuffer and subparser classes from https://github.com/EdgeCast/json_parser. The original string is left intact, there is no allocation and no copying of data. I have not compiled the following so there may be errors.
std::string input_string("hello world");
subbuffer input(input_string);
subparser flds(input, ' ', subparser::SKIP_EMPTY);
while (!flds.empty())
{
subbuffer fld = flds.next();
// do something with fld
}
// or if you know it is only two fields
subbuffer fld1 = input.before(' ');
subbuffer fld2 = input.sub(fld1.length() + 1).ltrim(' ');
Typecasting to (char*) got it working for me!
token = strtok((char *)str.c_str(), " ");
Chris's answer is probably fine when using std::string; however in case you want to use std::basic_string<char16_t>, std::getline can't be used. Here is a possible other implementation:
template <class CharT> bool tokenizestring(const std::basic_string<CharT> &input, CharT separator, typename std::basic_string<CharT>::size_type &pos, std::basic_string<CharT> &token) {
if (pos >= input.length()) {
// if input is empty, or ends with a separator, return an empty token when the end has been reached (and return an out-of-bound position so subsequent call won't do it again)
if ((pos == 0) || ((pos > 0) && (pos == input.length()) && (input[pos-1] == separator))) {
token.clear();
pos=input.length()+1;
return true;
}
return false;
}
typename std::basic_string<CharT>::size_type separatorPos=input.find(separator, pos);
if (separatorPos == std::basic_string<CharT>::npos) {
token=input.substr(pos, input.length()-pos);
pos=input.length();
} else {
token=input.substr(pos, separatorPos-pos);
pos=separatorPos+1;
}
return true;
}
Then use it like this:
std::basic_string<char16_t> s;
std::basic_string<char16_t> token;
std::basic_string<char16_t>::size_type tokenPos=0;
while (tokenizestring(s, (char16_t)' ', tokenPos, token)) {
...
}
It fails because str.c_str() returns a pointer to a constant string, but char * strtok (char * str, const char * delimiters ) requires a modifiable string. So you need to use const_cast<char *> in order to remove the const.
Here is a small but complete program that tokenizes the string using the C strtok() function.
#include <iostream>
#include <string>
#include <string.h>
using namespace std;
int main() {
string s="20#6 5, 3";
// strtok requires a modifiable string, as it modifies the supplied string in order to tokenize it
char *str=const_cast< char *>(s.c_str());
char *tok;
tok=strtok(str, "#, " );
int arr[4], i=0;
while(tok!=NULL){
arr[i++]=stoi(tok);
tok=strtok(NULL, "#, " );
}
for(int i=0; i<4; i++) cout<<arr[i]<<endl;
return 0;
}
NOTE: strtok may not be suitable in all situations, because the string passed to the function gets modified by being broken into smaller strings. Please refer to the following to get a better understanding of how strtok works.
How strtok works
A few print statements are added to better understand the changes happening to the string in each call to strtok and how it returns each token.
#include <iostream>
#include <string>
#include <string.h>
using namespace std;
int main() {
string s="20#6 5, 3";
char *str=const_cast< char *>(s.c_str());
char *tok;
cout<<"string: "<<s<<endl;
tok=strtok(str, "#, " );
cout<<"String: "<<s<<"\tToken: "<<tok<<endl;
while(tok!=NULL){
tok=strtok(NULL, "#, " );
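// tok is NULL once the tokens run out; streaming a null char* is undefined, so print an empty string instead
cout<<"String: "<<s<<"\t\tToken: "<<(tok ? tok : "")<<endl;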
}
return 0;
}
Output:
string: 20#6 5, 3
String: 206 5, 3 Token: 20
String: 2065, 3 Token: 6
String: 2065 3 Token: 5
String: 2065 3 Token: 3
String: 2065 3 Token:
strtok iterates over the string. On the first call it finds the first non-delimiter character (2 in this case) and marks it as the token start, then continues scanning for a delimiter, replaces it with a null character (# gets replaced in the actual string), and returns a pointer to the token start (i.e., it returns the token 20, which is now null-terminated). On each subsequent call it starts scanning from the next character and returns the next token if one is found, else null. Subsequently it returns the tokens 6, 5, and 3.
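The steps described above can be sketched as a simplified re-implementation. This is an illustration, not the library's actual code: the name my_strtok is invented, and the position that the real strtok keeps in hidden static storage is passed explicitly through a state parameter (similar in spirit to POSIX strtok_r).

```cpp
#include <cstring>

// Hypothetical, simplified strtok look-alike with explicit state.
char* my_strtok(char* str, const char* delims, char** state) {
    if (str == nullptr) str = *state;      // continue where the last call stopped
    if (str == nullptr) return nullptr;    // nothing left to scan

    str += std::strspn(str, delims);       // skip leading delimiters
    if (*str == '\0') {                    // only delimiters were left
        *state = nullptr;
        return nullptr;
    }

    char* token_end = str + std::strcspn(str, delims);  // first delimiter after the token
    if (*token_end != '\0') {
        *token_end = '\0';                 // overwrite the delimiter in place
        *state = token_end + 1;            // next call resumes after it
    } else {
        *state = nullptr;                  // token ran to the end of the string
    }
    return str;                            // pointer to the token start
}
```

Because the delimiter is overwritten rather than inserted, the buffer never grows, which matches the output shown above.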
Related
I wrote a program to remove spaces from a string in C++. But when I compile this program I get two errors: error: initializer fails to determine size of ‘str1’ and error: array must be initialized with a brace-enclosed initializer.
Can anyone show me the error in my code? I am new to C++. My code is below.
#include <stack>
#include <string>
#include <iostream>
#include <algorithm>
using namespace std;
char *delSpaces(char *str)
{
int i = 0, j = 0;
while (str[i])
{
if (str[i] != ' ')
str[j++] = str[i];
i++;
}
str[j] = '\0';
return str;
}
int main(){
string s;
cin >> s;
char str1[]=s;
cout << delSpaces(str1);
}
char str1[]=s;
is not valid C++.
I recommend changing your function to take a string argument by reference.
But if you can't do that, then one way to get read/write access to the char buffer of non-const s, since C++11, is
&s[0]
Since C++17 you can also use s.data().
However, note that your delSpaces creates a zero-terminated string in the supplied buffer, without knowing anything about a string. Since a string can hold any binary data s.length() would be unaffected by the call to delSpaces. You could fix that by adding a call to s.resize(), but again, a better approach is to express delSpaces with a string& argument.
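A minimal sketch of the s.resize() fix described above, assuming the char* version of delSpaces from the question is kept as it is and that the string contains no embedded null characters. The wrapper name delSpacesResized is made up for illustration.

```cpp
#include <cstring>
#include <string>

// The function from the question, unchanged in behaviour.
char* delSpaces(char* str) {
    int i = 0, j = 0;
    while (str[i]) {
        if (str[i] != ' ')
            str[j++] = str[i];
        i++;
    }
    str[j] = '\0';
    return str;
}

// Hypothetical wrapper: runs delSpaces on the string's own buffer,
// then shrinks the string to the new zero-terminated length.
void delSpacesResized(std::string& s) {
    if (s.empty()) return;
    delSpaces(&s[0]);                  // writable buffer access, valid since C++11
    s.resize(std::strlen(s.c_str()));  // drop the leftover characters after the new '\0'
}
```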
You can't initialize a char[] buffer with a std::string object, that is why you are getting errors.
You would have to do something more like this instead:
char *str1 = new char[s.size()+1];
std::copy(s.begin(), s.end(), str1);
str1[s.size()] = '\0';
std::cout << delSpaces(str1);
delete[] str1;
A better option would be to change the delSpaces() function to take a std::string by reference instead:
void delSpaces(std::string &str) {
size_t j = 0;
for (size_t i = 0; i < str.size(); ++i) {
if (str[i] != ' ')
str[j++] = str[i];
}
str.resize(j);
}
...
std::string s;
std::cin >> s;
delSpaces(s);
You can even let the STL remove the space characters for you:
void delSpaces(std::string &str) {
str.erase(
std::remove(str.begin(), str.end(), ' '),
str.end()
);
}
That being said, note that by default, when operator>> is reading character data, it ignores leading whitespace and then stops reading when it encounters whitespace, so your use of operator>> in this situation will never return a std::string that has spaces in it, thus your delSpaces() function is pretty useless as shown. Use std::getline() instead, which reads until it encounters a line break, and thus can return a string with spaces in it:
std::getline(std::cin, s);
I have a string and I want to remove all the punctuation from it. How do I do that? I did some research and found that people use the ispunct() function (I tried that), but I can't seem to get it to work in my code. Does anyone have any ideas?
#include <string>
int main() {
string text = "this. is my string. it's here."
if (ispunct(text))
text.erase();
return 0;
}
Using algorithm remove_copy_if :-
string text,result;
std::remove_copy_if(text.begin(), text.end(),
std::back_inserter(result), //Store output
std::ptr_fun<int, int>(&std::ispunct)
);
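std::ptr_fun was deprecated in C++11 and removed in C++17, so on a newer compiler the same approach can be written with a lambda. The sketch below assumes C++11 or later; the function name without_punct is made up for illustration. Taking the argument as unsigned char avoids passing negative values to std::ispunct.

```cpp
#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>

// Hypothetical helper: returns a copy of `text` with punctuation characters left out.
std::string without_punct(const std::string& text) {
    std::string result;
    std::remove_copy_if(text.begin(), text.end(),
                        std::back_inserter(result),  // store output
                        [](unsigned char c) { return std::ispunct(c) != 0; });
    return result;
}
```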
POW already has a good answer if you need the result as a new string. This answer is how to handle it if you want an in-place update.
The first part of the recipe is std::remove_if, which can remove the punctuation efficiently, packing all the non-punctuation as it goes.
std::remove_if (text.begin (), text.end (), ispunct)
Unfortunately, std::remove_if doesn't shrink the string to the new size. It can't because it has no access to the container itself. Therefore, there are junk characters left in the string after the packed result.
To handle this, std::remove_if returns an iterator to the end of the part of the string that's still needed. This can be used with the string's erase method, leading to the following idiom...
text.erase (std::remove_if (text.begin (), text.end (), ispunct), text.end ());
I call this an idiom because it's a common technique that works in many situations. Other types than string provide suitable erase methods, and std::remove (and probably some other algorithm library functions I've forgotten for the moment) take this approach of closing the gaps for items they remove, but leaving the container-resizing to the caller.
#include <string>
#include <iostream>
#include <cctype>
int main() {
std::string text = "this. is my string. it's here.";
for (int i = 0, len = text.size(); i < len; i++)
{
if (ispunct(text[i]))
{
text.erase(i--, 1);
len = text.size();
}
}
std::cout << text;
return 0;
}
Output
this is my string its here
When you delete a character, the size of the string changes. It has to be updated whenever deletion occurs. And, you deleted the current character, so the next character becomes the current character. If you don't decrement the loop counter, the character next to the punctuation character will not be checked.
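ispunct takes a char value, not a string.
You can do something like:
for (std::size_t i = 0; i < text.size(); )
if (ispunct(static_cast<unsigned char>(text[i]))) text.erase(i, 1); // erase shifts the rest left, so stay at i
else ++i;
This will work, but it is a slow algorithm.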
Pretty good answer by Steve314.
I would like to add a small change :
text.erase (std::remove_if (text.begin (), text.end (), ::ispunct), text.end ());
Adding the :: before ispunct selects the C library function from the global namespace, which avoids the ambiguity with the overloaded std::ispunct declared in <locale>.
The problem here is that ispunct() takes one argument being a character, while you are trying to send a string. You should loop over the elements of the string and erase each character if it is a punctuation like here:
for(size_t i = 0; i<text.length(); ++i)
if(ispunct(text[i]))
text.erase(i--, 1);
#include <iostream>
#include <string>
#include <algorithm>
using namespace std;
int main() {
string str = "this. is my string. it's here.";
transform(str.begin(), str.end(), str.begin(), [](char ch)
{
if( ispunct(ch) )
return '\0';
return ch;
});
}
#include <iostream>
#include <string>
using namespace std;
int main()
{
string s;//string is defined here.
cout << "Please enter a string with punctuation's: " << endl;//Asking for users input
getline(cin, s);//reads in a single string one line at a time
/* ERROR Check: The loop didn't run at first because a semi-colon was placed at the end
of the statement. Remember not to add it for loops. */
for(auto &c : s) //loop checks every character
{
if (ispunct(c)) //to see if its a punctuation
{
c=' '; //if so it replaces it with a blank space.(delete)
}
}
cout << s << endl;
system("pause");
return 0;
}
Another way you could do this would be as follows:
#include <ctype.h> //needed for ispunct()
string onlyLetters(string str){
string retStr = "";
for(int i = 0; i < str.length(); i++){
if(!ispunct(str[i])){
retStr += str[i];
}
}
return retStr;
}
This ends up creating a new string instead of actually erasing the characters from the old string, but it is a little easier to wrap your head around than using some of the more complex built in functions.
I tried to apply @Steve314's answer but couldn't get it to work until I came across this note here on cppreference.com:
Notes
Like all other functions from <cctype>, the behavior of std::ispunct
is undefined if the argument's value is neither representable as
unsigned char nor equal to EOF. To use these functions safely with
plain chars (or signed chars), the argument should first be converted
to unsigned char.
By studying the example it provides, I am able to make it work like this:
#include <string>
#include <iostream>
#include <cctype>
#include <algorithm>
int main()
{
std::string text = "this. is my string. it's here.";
std::string result;
text.erase(std::remove_if(text.begin(),
text.end(),
[](unsigned char c) { return std::ispunct(c); }),
text.end());
std::cout << text << std::endl;
}
Try this one; it will remove all the punctuation from the string read from the text file.
str.erase(remove_if(str.begin(), str.end(), ::ispunct), str.end());
please reply if helpful
i got it.
size_t found = text.find('.');
text.erase(found, 1);
I have string like this 1-2,4^,14-56
I am expecting output 2-3,5^,15-57
char input[48];
int temp;
char *pch;
pch = strtok(input, "-,^");
while(pch != NULL)
{
char tempch[10];
temp = atoi(pch);
temp++;
itoa(temp, tempch, 10);
memcpy(pch, tempch, strlen(tempch));
pch = strtok(NULL, "-,^");
}
After running through this if I print input it prints only 2 which is first character of the updated string. It does not print all characters in the string. What is the problem with my code?
For plain C, use the library function strtol. Unlike atoi, this can update a pointer to the next unparsed character:
long strtol (const char *restrict str, char **restrict endptr, int base);
...
The strtol() function converts the string in str to a long value. [...] If endptr is not NULL, strtol() stores the address of the first invalid character in *endptr.
Since there may be more than one 'not-a-digit' character between the numbers, skip them with the library function isdigit. I placed this at the start of the loop so it would not accidentally convert a string such as -2,3 to -1,4 -- the initial -2 would be picked up first! (And if that is a problem elsewhere, there is also a strtoul.)
Since it appears you want the result in a char string, I use sprintf to copy the output into a buffer, which must be large enough for your possible input plus extra characters caused by a decimal overflow.
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
int main (void)
{
char *inputString = "1-2,4^,14-56";
char *next_code_at = inputString;
long result;
char dest[100], *dest_ptr;
printf ("%s\n", inputString);
dest[0] = 0;
dest_ptr = dest;
while (next_code_at && *next_code_at)
{
while (*next_code_at && !(isdigit(*next_code_at)))
{
dest_ptr += sprintf (dest_ptr, "%c", *next_code_at);
next_code_at++;
}
if (*next_code_at)
{
errno = 0; /* strtol only sets errno on failure, so clear any stale value first */
result = strtol (next_code_at, &next_code_at, 10);
if (errno)
{
perror ("strtol failed");
return EXIT_FAILURE;
} else
{
if (result < LONG_MAX)
dest_ptr += sprintf (dest_ptr, "%ld", result+1);
else
{
fprintf (stderr, "number too large!\n");
return EXIT_FAILURE;
}
}
}
}
printf ("%s\n", dest);
return EXIT_SUCCESS;
}
Sample run:
Input: 1-2,4^,14-56
Output: 2-3,5^,15-57
There are two major problems with this code:
First of all,
pch = strtok(input, ",");
When applied to the string 1-2,4^,14-56 will return the token 1-2.
When you call atoi("1-2") you'll get 1, which gets converted to 2.
You can fix this by changing the first strtok to pch = strtok(input, "-,^");
Second of all, strtok modifies the string, which means that you lose the original delimiter found. As this looks like a homework exercise, I'll leave you to figure out how to get around this.
I think this could be easier using regular expressions (and C++ instead of C, of course):
Complete example:
#include <iostream>
#include <iterator>
#include <regex>
#include <string>
int main()
{
// Your test string.
std::string input("1-2,4^,14-56");
// Regex representing a number.
std::regex number("\\d+");
// Iterators for traversing the test string using the number regex.
auto ri_begin = std::sregex_iterator(input.begin(), input.end(), number);
auto ri_end = std::sregex_iterator();
for (auto i = ri_begin; i != ri_end; ++i)
{
std::smatch match = *i; // Match a number.
int value = std::stoi(match.str()); // Convert that number to integer.
std::string replacement = std::to_string(++value); // Increment 1 and convert to string again.
input.replace(match.position(), match.length(), replacement); // Finally replace.
}
std::cout << input << std::endl;
return 0;
}
Output:
2-3,5^,15-57
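One caveat with the loop above: it edits input while the regex iterators still refer to it, which only works out when every replacement has the same length as its match (9 becoming 10 would shift the later positions and may invalidate the iterators). A sketch of a variant that builds a separate output string instead; the function name increment_numbers is made up for illustration, and std::stol will still throw for numbers that do not fit in a long.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns a copy of `input` with every run of digits incremented by one.
std::string increment_numbers(const std::string& input) {
    static const std::regex number("\\d+");
    std::string output;
    std::string::size_type copied = 0;  // how much of `input` is already in `output`

    for (std::sregex_iterator it(input.begin(), input.end(), number), end; it != end; ++it) {
        const std::smatch& match = *it;
        const auto pos = static_cast<std::string::size_type>(match.position());
        output.append(input, copied, pos - copied);            // text between the numbers
        output += std::to_string(std::stol(match.str()) + 1);  // the incremented number
        copied = pos + static_cast<std::string::size_type>(match.length());
    }
    output.append(input, copied, std::string::npos);           // whatever follows the last number
    return output;
}
```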
strtok modifies the string you pass to it. Either use strchr or something like that to find the delimiters or make a copy of the string to work on.
In my program, I am checking a whole C string; if any spaces or punctuation marks are found, I want to put an empty character at that location, but the compiler is giving me an error: empty character constant.
Please help me out. In my loop I am checking like this:
if(ispunct(str1[start])) {
str1[start]=''; // << empty character constant.
}
if(isspace(str1[start])) {
str1[start]=''; // << empty character constant.
}
This is where my errors are; please correct me.
For example, if the word is str,, ing, the output should be string.
There is no such thing as an empty character.
If you mean a space then change '' to ' ' (with a space in it).
If you mean NUL then change it to '\0'.
Edit: the answer is no longer relevant now that the OP has edited the question. Leaving up for posterity's sake.
If you're wanting to add a null character, use '\0'. If you're wanting to use a different character, using the appropriate character for that. You can't assign it nothing. That's meaningless. That's like saying
int myHexInt = 0x;
or
long long myIndeger = L;
The compiler will error. Put in the value you wanted. In the char case, that's a value from 0 to 255.
UPDATE:
From the edit to OP's question, it's apparent that he/she wanted to trim a string of punctuation and space characters.
As detailed in the flagged possible duplicate, one way is to use remove_copy_if:
string test = "THisisa test;;';';';";
string temp, finalresult;
remove_copy_if(test.begin(), test.end(), std::back_inserter(temp), ptr_fun<int, int>(&ispunct));
remove_copy_if(temp.begin(), temp.end(), std::back_inserter(finalresult), ptr_fun<int, int>(&isspace));
ORIGINAL
Examining your question, replacing spaces with spaces is redundant, so you really need to figure out how to replace punctuation characters with spaces. You can do so using a comparison function (by wrapping std::ispunct) in tandem with std::replace_if from the STL:
#include <string>
#include <algorithm>
#include <iostream>
#include <cctype>
using namespace std;
bool is_punct(const char& c) {
return ispunct(c);
}
int main() {
string test = "THisisa test;;';';';";
char test2[] = "THisisa test;;';';'; another";
size_t size = sizeof(test2)/sizeof(test2[0]);
replace_if(test.begin(), test.end(), is_punct, ' ');//for C++ strings
replace_if(&test2[0], &test2[size-1], is_punct, ' ');//for c-strings
cout << test << endl;
cout << test2 << endl;
}
This outputs:
THisisa test
THisisa test another
Try this (as you asked for cstring explicitly):
char str1[100] = "str,, ing";
if(ispunct(str1[start]) || isspace(str1[start])) {
memmove(str1 + start, str1 + start + 1, strlen(str1) - start); // source and destination overlap, so use memmove rather than strncpy
}
Well, doing this in pure C, there are more efficient solutions (have a look at @MichaelPlotke's answer for details).
But as you also explicitly ask for c++, I'd recommend a solution as follows:
Note you can use the standard c++ algorithms for 'plain' c-style character arrays also. You just have to place your predicate conditions for removal into a small helper functor and use it with the std::remove_if() algorithm:
struct is_char_category_in_question {
bool operator()(const char& c) const;
};
And later use it like:
#include <string>
#include <algorithm>
#include <iostream>
#include <cctype>
#include <cstring>
// Best chance to have the predicate elided to be inlined, when writing
// the functor like this:
struct is_char_category_in_question {
bool operator()(const char& c) const {
return std::ispunct(c) || std::isspace(c);
}
};
int main() {
static char str1[100] = "str,, ing";
size_t size = strlen(str1);
// Using std::remove_if() is likely to provide the best balance of performance
// and code size efficiency you can expect from your compiler implementation.
std::remove_if(&str1[0], &str1[size + 1], is_char_category_in_question());
// Regarding the end of the range in the above statement, note we have to add
// 1 to the strlen() calculated size, to catch the closing `\0` character of
// the c-style string being copied correctly and terminate the result as well!
std::cout << str1 << std::endl; // Prints: string
}
See this compilable and working sample also here.
As I don't like the accepted answer, here's mine:
#include <stdio.h>
#include <string.h>
#include <cctype>
int main() {
char str[100] = "str,, ing";
int bad = 0;
int cur = 0;
while (str[cur] != '\0') {
if (bad < cur && !ispunct(str[cur]) && !isspace(str[cur])) {
str[bad] = str[cur];
}
if (ispunct(str[cur]) || isspace(str[cur])) {
cur++;
}
else {
cur++;
bad++;
}
}
str[bad] = '\0';
fprintf(stdout, "cur = %d; bad = %d; str = %s\n", cur, bad, str);
return 0;
}
Which outputs cur = 9; bad = 6; str = string
This has the advantage of being more efficient and more readable, hm, well, in a style I happen to like better (see comments for a lengthy debate / explanation).
What is the effective way to replace all occurrences of a character with another character in std::string?
std::string doesn't contain such a function, but you can use the stand-alone replace function from the algorithm header.
#include <algorithm>
#include <string>
void some_func() {
std::string s = "example string";
std::replace( s.begin(), s.end(), 'x', 'y'); // replace all 'x' to 'y'
}
The question is centered on character replacement, but, as I found this page very useful (especially Konrad's remark), I'd like to share this more generalized implementation, which allows to deal with substrings as well:
std::string ReplaceAll(std::string str, const std::string& from, const std::string& to) {
size_t start_pos = 0;
while((start_pos = str.find(from, start_pos)) != std::string::npos) {
str.replace(start_pos, from.length(), to);
start_pos += to.length(); // Skips the inserted text, which handles the case where 'from' is a substring of 'to'
}
return str;
}
Usage:
std::cout << ReplaceAll(std::string("Number Of Beans"), std::string(" "), std::string("_")) << std::endl;
std::cout << ReplaceAll(std::string("ghghjghugtghty"), std::string("gh"), std::string("X")) << std::endl;
std::cout << ReplaceAll(std::string("ghghjghugtghty"), std::string("gh"), std::string("h")) << std::endl;
Outputs:
Number_Of_Beans
XXjXugtXty
hhjhugthty
EDIT:
The above can be implemented in a more suitable way, in case performance is of your concern, by returning nothing (void) and performing the changes "in-place"; that is, by directly modifying the string argument str, passed by reference instead of by value. This would avoid an extra costly copy of the original string by overwriting it.
Code :
static inline void ReplaceAll2(std::string &str, const std::string& from, const std::string& to)
{
// Same inner code...
// No return statement
}
Hope this will be helpful for some others...
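For reference, an in-place variant along those lines could look like the sketch below. It is an illustration under assumptions, not the author's exact code: the name replace_all_in_place is invented, and a guard for an empty from is added because an empty pattern matches at every position and the loop would otherwise never terminate.

```cpp
#include <string>

// Hypothetical in-place variant: modifies `str` directly and returns nothing.
void replace_all_in_place(std::string& str, const std::string& from, const std::string& to) {
    if (from.empty()) return;  // an empty pattern would match forever
    std::string::size_type pos = 0;
    while ((pos = str.find(from, pos)) != std::string::npos) {
        str.replace(pos, from.length(), to);
        pos += to.length();    // continue after the inserted text, never inside it
    }
}
```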
I thought I'd toss in the boost solution as well:
#include <boost/algorithm/string/replace.hpp>
// in place
std::string in_place = "blah#blah";
boost::replace_all(in_place, "#", "@");
// copy
const std::string input = "blah#blah";
std::string output = boost::replace_all_copy(input, "#", "@");
Imagine a large binary blob where all 0x00 bytes shall be replaced by "\1\x30" and all 0x01 bytes by "\1\x31" because the transport protocol allows no \0-bytes.
In cases where:
the replacing and the to-replaced string have different lengths,
there are many occurrences of the to-replaced string within the source string and
the source string is large,
the provided solutions cannot be applied (because they replace only single characters) or have a performance problem, because they would call string::replace several times which generates copies of the size of the blob over and over.
(I do not know the boost solution, maybe it is OK from that perspective)
This one walks along all occurrences in the source string and builds the new string piece by piece once:
void replaceAll(std::string& source, const std::string& from, const std::string& to)
{
std::string newString;
newString.reserve(source.length()); // avoids a few memory allocations
std::string::size_type lastPos = 0;
std::string::size_type findPos;
while(std::string::npos != (findPos = source.find(from, lastPos)))
{
newString.append(source, lastPos, findPos - lastPos);
newString += to;
lastPos = findPos + from.length();
}
// Care for the rest after last occurrence
newString += source.substr(lastPos);
source.swap(newString);
}
A simple find and replace for a single character would go something like:
s.replace(s.find("x"), 1, "y")
To do this for the whole string, the easy thing to do would be to loop until your s.find starts returning npos. I suppose you could also catch std::out_of_range (which replace throws when it is handed npos as the position) to exit the loop, but that's kinda ugly.
For completeness, here's how to do it with std::regex.
#include <regex>
#include <string>
int main()
{
const std::string s = "example string";
const std::string r = std::regex_replace(s, std::regex("x"), "y");
}
If you're looking to replace more than a single character, and are dealing only with std::string, then this snippet would work, replacing sNeedle in sHaystack with sReplace, and sNeedle and sReplace do not need to be the same size. This routine uses the while loop to replace all occurrences, rather than just the first one found from left to right.
while(sHaystack.find(sNeedle) != std::string::npos) {
sHaystack.replace(sHaystack.find(sNeedle),sNeedle.size(),sReplace);
}
As Kirill suggested, either use the replace method or iterate along the string replacing each char independently.
Alternatively you can use the find method or find_first_of depending on what you need to do. None of these solutions will do the job in one go, but with a few extra lines of code you ought to make them work for you. :-)
What about Abseil StrReplaceAll? From the header file:
// This file defines `absl::StrReplaceAll()`, a general-purpose string
// replacement function designed for large, arbitrary text substitutions,
// especially on strings which you are receiving from some other system for
// further processing (e.g. processing regular expressions, escaping HTML
// entities, etc.). `StrReplaceAll` is designed to be efficient even when only
// one substitution is being performed, or when substitution is rare.
//
// If the string being modified is known at compile-time, and the substitutions
// vary, `absl::Substitute()` may be a better choice.
//
// Example:
//
// std::string html_escaped = absl::StrReplaceAll(user_input, {
// {"&", "&amp;"},
// {"<", "&lt;"},
// {">", "&gt;"},
// {"\"", "&quot;"},
// {"'", "&#39;"}});
#include <iostream>
#include <string>
using namespace std;
// Replace function..
string replace(string word, string target, string replacement){
int len, loop=0;
string nword="", let;
len=word.length();
len--;
while(loop<=len){
let=word.substr(loop, 1);
if(let==target){
nword=nword+replacement;
}else{
nword=nword+let;
}
loop++;
}
return nword;
}
//Main..
int main() {
string word;
cout<<"Enter Word: ";
cin>>word;
cout<<replace(word, "x", "y")<<endl;
return 0;
}
Old School :-)
std::string str = "H:/recursos/audio/youtube/libre/falta/";
for (size_t i = 0; i < str.size(); i++) {
if (str[i] == '/') {
str[i] = '\\';
}
}
std::cout << str;
Result:
H:\recursos\audio\youtube\libre\falta\
For simple situations this works pretty well without using any library other than std::string (which is already in use).
Replace all occurrences of character a with character b in some_string:
for (size_t i = 0; i < some_string.size(); ++i) {
if (some_string[i] == 'a') {
some_string.replace(i, 1, "b");
}
}
If the string is large or the repeated calls to replace are a concern, you can apply the technique described in this answer: https://stackoverflow.com/a/29752943/3622300
Here's a solution I rolled, in a maximal DRI spirit.
It searches for sNeedle in sHaystack and replaces it with sReplace,
nTimes if non-zero, otherwise every occurrence of sNeedle.
It will not search again in the replaced text.
std::string str_replace(
std::string sHaystack, std::string sNeedle, std::string sReplace,
size_t nTimes=0)
{
size_t found = 0, pos = 0, c = 0;
size_t len = sNeedle.size();
size_t replen = sReplace.size();
std::string input(sHaystack);
do {
found = input.find(sNeedle, pos);
if (found == std::string::npos) {
break;
}
input.replace(found, len, sReplace);
pos = found + replen;
++c;
} while(!nTimes || c < nTimes);
return input;
}
I think I'd use std::replace_if()
A simple character-replacer (requested by OP) can be written by using standard library functions.
For an in-place version:
#include <string>
#include <algorithm>
void replace_char(std::string& in,
std::string::value_type srch,
std::string::value_type repl)
{
std::replace_if(std::begin(in), std::end(in),
[&srch](std::string::value_type v) { return v==srch; },
repl);
return;
}
and an overload that returns a copy if the input is a const string:
std::string replace_char(std::string const& in,
std::string::value_type srch,
std::string::value_type repl)
{
std::string result{ in };
replace_char(result, srch, repl);
return result;
}
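A side note on the choice of algorithm: when the thing to match is one fixed character, std::replace does the same job without a lambda, while std::replace_if earns its keep once the match is a real predicate. A minimal sketch of both, with function names invented for this example:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Fixed character: std::replace compares each element against `srch`.
void replace_char_simple(std::string& in, char srch, char repl)
{
    std::replace(in.begin(), in.end(), srch, repl);
}

// Predicate: replace every whitespace character with `repl`.
// The lambda takes unsigned char because std::isspace is only defined
// for values representable as unsigned char (or EOF).
void replace_whitespace(std::string& in, char repl)
{
    std::replace_if(in.begin(), in.end(),
                    [](unsigned char c) { return std::isspace(c) != 0; },
                    repl);
}
```

Both modify the string in place, like the in-place replace_char above.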
This works! I used something similar for a bookstore app, where the inventory was stored in a CSV-style file (a .dat file). Note that when the replacement is a single character, e.g. |, it must be written as a string literal "|" rather than a character literal '|'; otherwise the compiler reports an invalid conversion from char to const char*.
#include <iostream>
#include <string>
using namespace std;
int main()
{
int count = 0; // for the number of occurrences.
// final hold variable of corrected word up to the npos=j
string holdWord = "";
// a temp var in order to replace 0 to new npos
string holdTemp = "";
// a CSV line for one entry in a book store
string holdLetter = "Big Java 7th Ed,Horstman,978-1118431115,99.85";
// j = npos
for (int j = 0; j < holdLetter.length(); j++) {
if (holdLetter[j] == ',') {
if ( count == 0 )
{
holdWord = holdLetter.replace(j, 1, " | ");
}
else {
string holdTemp1 = holdLetter.replace(j, 1, " | ");
// since replacement is three positions in length,
// must replace new replacement's 0 to npos-3, with
// the 0 to npos - 3 of the old replacement
holdTemp = holdTemp1.replace(0, j-3, holdWord, 0, j-3);
holdWord = "";
holdWord = holdTemp;
}
holdTemp = "";
count++;
}
}
cout << holdWord << endl;
return 0;
}
// result:
Big Java 7th Ed | Horstman | 978-1118431115 | 99.85
Unusually for me, I am currently using CentOS, so my compiler version is given below (g++, which defaults to C++98):
g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
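On the point about quoting a single-character replacement: std::string::replace also has an overload taking a repeat count and a char, so a character literal can be used if you pass the count explicitly. A small sketch, assuming a hypothetical helper named commas_to_pipes; it should also compile as C++98:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replace each ',' with the single character '|'.
std::string commas_to_pipes(std::string line)
{
    for (std::size_t j = 0; j < line.size(); ++j) {
        if (line[j] == ',') {
            // replace(pos, len, count, ch): substitute one copy of '|'.
            // Equivalent here to line.replace(j, 1, "|").
            line.replace(j, 1, 1, '|');
        }
    }
    return line;
}
```

Because the replacement has the same length as the text it replaces, the loop index needs no adjustment after each substitution.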
This is not the only method missing from the standard library; it was intended to be low level.
This use case and many others are covered by general-purpose libraries such as:
POCO
Abseil
Boost
QtCore
QtCore and its QString have my preference: QString supports UTF-8 and uses fewer templates, which means understandable errors and faster compilation. It uses the "q" prefix, which makes namespaces unnecessary and simplifies headers.
Boost often generates hideous error messages and slow compile times.
POCO seems to be a reasonable compromise.
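How about replacing a delimiter character with a string of any length using only good old C string functions? Note that strtok treats its second argument as a set of delimiter characters, not as a substring, so this only works when replace_this is a single character. Runs of consecutive delimiters are collapsed into one, with_this is appended after the last token even if original did not end with a delimiter, and dest must be large enough to hold the expanded text.
char original[256]="First Line\nNext Line\n", dest[256]="";
const char* replace_this = "\n"; // a single delimiter character (strtok splits on any char in this string)
const char* with_this = "\r\n"; // this is 2 characters but could be of any length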
/* get the first token */
char* token = strtok(original, replace_this);
/* walk through other tokens */
while (token != NULL) {
strcat(dest, token);
strcat(dest, with_this);
token = strtok(NULL, replace_this);
}
dest should now have what we are looking for.
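Since strtok splits on a set of characters, a search string longer than one character needs a different approach. One option that still sticks to the C string functions is strstr, which finds a whole substring. The following is only a sketch: the function name replace_cstr, its parameters, and the choice to return false on overflow are assumptions made for this example.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy `src` into `dest` (capacity `dest_size` bytes),
// replacing each occurrence of the string `find` with `repl`.
// Returns false if `find` is empty or the result would not fit; in that
// case the contents of `dest` are unspecified.
bool replace_cstr(const char* src, const char* find, const char* repl,
                  char* dest, std::size_t dest_size)
{
    const std::size_t find_len = std::strlen(find);
    const std::size_t repl_len = std::strlen(repl);
    if (find_len == 0 || dest_size == 0) return false;

    std::size_t used = 0;  // characters written so far, excluding the '\0'
    const char* hit;
    while ((hit = std::strstr(src, find)) != NULL) {
        const std::size_t prefix = static_cast<std::size_t>(hit - src);
        if (used + prefix + repl_len >= dest_size) return false;
        std::memcpy(dest + used, src, prefix);              // text before the match
        std::memcpy(dest + used + prefix, repl, repl_len);  // the replacement
        used += prefix + repl_len;
        src = hit + find_len;                               // resume after the match
    }
    const std::size_t tail = std::strlen(src);
    if (used + tail >= dest_size) return false;
    std::memcpy(dest + used, src, tail + 1);  // remaining text plus '\0'
    return true;
}
```

Unlike the strtok loop, this leaves src untouched, keeps consecutive matches separate, and adds nothing after the final piece unless the input actually ends with the search string.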