I have a vector containing strings that follow the format of text_number-number
Eg: Example_45-3
I only want the first number (45 in the example) and nothing else which I am able to do with my current code:
std::vector<std::string> imgNumStrVec;
for(size_t i = 0; i < StrVec.size(); i++){
std::vector<std::string> seglist;
std::stringstream ss(StrVec[i]);
std::string seg, seg2;
while(std::getline(ss, seg, '_')) seglist.push_back(seg);
std::stringstream ss2(seglist[1]);
std::getline(ss2, seg2, '-');
imgNumStrVec.push_back(seg2);
}
Are there more streamlined and simpler ways of doing this, and if so, what are they?
I ask purely out of a desire to learn how to code better. At the end of the day the code above does successfully extract just the first number, but it seems long-winded and roundabout.
You can also use the built in find_first_of and find_first_not_of to find the first "numberstring" in any string.
std::string first_numberstring(std::string const & str)
{
char const* digits = "0123456789";
std::size_t const n = str.find_first_of(digits);
if (n != std::string::npos)
{
std::size_t const m = str.find_first_not_of(digits, n);
return str.substr(n, m != std::string::npos ? m-n : m);
}
return std::string();
}
This should be more efficient than Ashot Khachatryan's solution. Note the use of '_' and '-' instead of "_" and "-". And also, the starting position of the search for '-'.
inline std::string mid_num_str(const std::string& s) {
std::string::size_type p = s.find('_');
std::string::size_type pp = s.find('-', p + 2);
return s.substr(p + 1, pp - p - 1);
}
If you need a number instead of a string, like what Alexandr Lapenkov's solution has done, you may also want to try the following:
inline long mid_num(const std::string& s) {
return std::strtol(&s[s.find('_') + 1], nullptr, 10);
}
Updated for C++11
(Important note on compiler regex support: with gcc you need version 4.9 or later. I tested this on g++ versions 4.9[1] and 9.2, using the in-browser compiler on cppreference.com.)
Thanks to user @2b-t, who found a bug in the C++11 code!
Here is the C++11 code:
#include <iostream>
#include <string>
#include <regex>
using std::cout;
using std::endl;
int main() {
std::string input = "Example_45-3";
std::string output = std::regex_replace(
input,
std::regex("[^0-9]*([0-9]+).*"),
std::string("$1")
);
cout << input << endl;
cout << output << endl;
}
Boost solution that only requires C++98
Minimal implementation example that works on many strings (not just strings of the form "text_45-text"):
#include <iostream>
#include <string>
using namespace std;
#include <boost/regex.hpp>
int main() {
string input = "Example_45-3";
string output = boost::regex_replace(
input,
boost::regex("[^0-9]*([0-9]+).*"),
string("\\1")
);
cout << input << endl;
cout << output << endl;
}
console output:
Example_45-3
45
Other example strings that this would work on:
"asdfasdf 45 sdfsdf"
"X = 45, sdfsdf"
For this example I used g++ on Linux with #include <boost/regex.hpp> and -lboost_regex. You could also use the C++11 <regex> header.
Feel free to edit my solution if you have a better regex.
Commentary:
If there aren't performance constraints, a regex is ideal for this sort of thing because you aren't reinventing the wheel (by writing a bunch of string-parsing code that takes time to write and to test fully).
Additionally, if/when your strings become more complex or have more varied patterns, a regex easily accommodates the complexity. (The question's example pattern is easy enough, but a more complex pattern can often take 10-100+ lines of code where a one-line regex would do the same.)
[1] Apparently full support for C++11 <regex> was implemented and released for g++ version 4.9.x, on Jun 26, 2015. Hat tip to SO questions #1 and #2 for figuring out the compiler version needing to be 4.9.x.
Check this out
std::string ex = "Example_45-3";
int num;
sscanf( ex.c_str(), "%*[^_]_%d", &num );
I can think of two ways of doing it:
Use regular expressions
Use an iterator to step through the string, and copy each consecutive digit to a temporary buffer. Break when it reaches an unreasonable length or on the first non-digit after a string of consecutive digits. Then you have a string of digits that you can easily convert.
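The second option above can be sketched roughly as follows. This is only a minimal illustration, not a tuned implementation: the helper name first_digit_run and the max_len cap (standing in for "an unreasonable length") are assumptions made for the example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the first run of consecutive digits in `s`,
// or an empty string if `s` contains no digit at all.
// `max_len` is an assumed cap standing in for "an unreasonable length".
std::string first_digit_run(const std::string& s, std::size_t max_len = 18) {
    std::string buffer;  // the temporary buffer for the digits
    for (std::string::const_iterator it = s.begin(); it != s.end(); ++it) {
        // Cast to unsigned char: passing a negative char to isdigit is undefined.
        if (std::isdigit(static_cast<unsigned char>(*it))) {
            buffer.push_back(*it);
            if (buffer.size() == max_len) break;  // run is unreasonably long
        } else if (!buffer.empty()) {
            break;  // first non-digit after a run of digits
        }
    }
    return buffer;
}
```

With the question's input, first_digit_run("Example_45-3") yields "45", which can then be handed to std::stoi or similar if a number is needed.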
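std::string s = "Example_45-3";
// find() returns std::string::size_type; storing it in an int can truncate npos
std::string::size_type p1 = s.find("_");
std::string::size_type p2 = s.find("-");
std::string number = s.substr(p1 + 1, p2 - p1 - 1);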
The 'best' way to do this in C++11 and later is probably using regular expressions, which combine high expressiveness and high performance when the test is repeated often enough.
The following code demonstrates the basics. You should #include <regex> for it to work.
// The example inputs
std::vector<std::string> inputs {
"Example_0-0", "Example_0-1", "Example_0-2", "Example_0-3", "Example_0-4",
"Example_1-0", "Example_1-1", "Example_1-2", "Example_1-3", "Example_1-4"
};
// The regular expression. A lot of the cost is incurred when building the
// std::regex object, but when it's reused a lot that cost is amortised.
std::regex imgNumRegex { "^[^_]+_([[:digit:]]+)-([[:digit:]]+)$" };
for (const auto &input: inputs){
// This will contain the match results. Parts of the regular expression
// enclosed in parentheses will be stored here, so in this case: both numbers
std::smatch matchResults;
if (!std::regex_match(input, matchResults, imgNumRegex)) {
// Handle failure to match
abort();
}
// Note that the first match is in str(1). str(0) contains the whole string
std::string theFirstNumber = matchResults.str(1);
std::string theSecondNumber = matchResults.str(2);
std::cout << "The input had numbers " << theFirstNumber;
std::cout << " and " << theSecondNumber << std::endl;
}
Using @Pixelchemist's answer and e.g. std::stoul:
bool getFirstNumber(std::string const & a_str, unsigned long & a_outVal)
{
auto pos = a_str.find_first_of("0123456789");
try
{
if (std::string::npos != pos)
{
a_outVal = std::stoul(a_str.substr(pos));
return true;
}
}
catch (...)
{
// handle conversion failure
// ...
}
return false;
}
I am faced with a simple yet complex challenge today.
In my program, I wish to insert a - character every three characters of a string. How would this be accomplished? Thank you for your help.
#include <iostream>
int main()
{
std::string s = "thisisateststring";
// Desired output: thi-sis-ate-sts-tri-ng
std::cout << s << std::endl;
return 0;
}
There is no need to "build a new string".
Loop a position iteration, starting at 3, incrementing by 4 with each pass, inserting a - at the position indicated. Stop when the next insertion point would breach the string (which has been growing by one with each pass, thus the need for the 4 slot skip):
#include <iostream>
#include <string>
int main()
{
std::string s = "thisisateststring";
for (std::string::size_type i=3; i<s.size(); i+=4)
s.insert(i, 1, '-');
// Desired output: thi-sis-ate-sts-tri-ng
std::cout << s << std::endl;
return 0;
}
Output
thi-sis-ate-sts-tri-ng
Just take an empty string and append "-" after every third character (note that this leaves a trailing "-" when the length of the string is a multiple of 3):
#include <iostream>
int main()
{
std::string s = "thisisateststring";
std::string res="";
int count=0;
for(int i=0;i<s.length();i++){
count++;
res+=s[i];
if(count%3==0){
res+="-";
}
}
std::cout << res << std::endl;
return 0;
}
output
thi-sis-ate-sts-tri-ng
A general (and efficient) approach is to build a new string by iterating character-by-character over the existing one, making any desired changes as you go. In this case, every third character you can insert a hyphen:
std::string result;
result.reserve(s.size() + s.size() / 3);
for (size_t i = 0; i != s.size(); ++i) {
if (i != 0 && i % 3 == 0)
result.push_back('-');
result.push_back(s[i]);
}
Simple. Iterate the string and build a new one
Copy each character from the old string to the new one and every time you've copied 3 characters add an extra '-' to the end of the new string and restart your count of copied characters.
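A minimal sketch of that copy-and-count loop is given below. The function name insert_every and its parameters are made up for the example. One deliberate deviation from the description: the separator is written just before the next character is copied rather than right after the third one, so that no separator is left dangling at the end when the length is a multiple of the group size.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies `s` into a new string, putting `sep` between
// groups of `group` characters.
std::string insert_every(const std::string& s, std::size_t group = 3, char sep = '-') {
    if (group == 0) return s;  // nothing sensible to do for empty groups
    std::string out;
    out.reserve(s.size() + s.size() / group);
    std::size_t copied = 0;  // characters copied since the last separator
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (copied == group) {  // a full group precedes this character
            out.push_back(sep);
            copied = 0;         // restart the count
        }
        out.push_back(s[i]);
        ++copied;
    }
    return out;
}
```

For the question's input, insert_every("thisisateststring") gives thi-sis-ate-sts-tri-ng.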
Like 99% of problems with text, this one can be solved with a regular expression one-liner:
std::regex_replace(input, std::regex{".{3}"}, "$&-")
However, it brings not one, but two new problems:
it is not a very performant solution
regex library is huge and bloats resulting binary
So think twice.
You could write a simple functor to add the hyphens, like this:
#include <iostream>
struct inserter
{
unsigned n = 0u;
void operator()(char c)
{
std::cout << c;
if (++n%3 == 0) std::cout << '-';
}
};
This can be passed to the standard for_each() algorithm:
#include <algorithm>
int main()
{
const std::string s = "thisisateststring";
std::for_each(s.begin(), s.end(), inserter());
std::cout << std::endl;
}
Exercise: extend this class to work with different intervals, output streams, replacement characters and string types (narrow or wide).
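One possible direction for the exercise, offered as a sketch rather than a reference solution: a functor templated on the character type that writes to any matching stream. The names basic_inserter and separated are invented here. Unlike the functor above, this version writes the separator before the next character, so nothing trails at the end.

```cpp
#include <algorithm>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical generalisation: configurable stream, interval and separator.
template <typename CharT>
class basic_inserter {
public:
    basic_inserter(std::basic_ostream<CharT>& out, std::size_t interval, CharT sep)
        : out_(&out), interval_(interval), sep_(sep), n_(0) {}

    void operator()(CharT c) {
        // Emit the separator between groups, never before the first character.
        if (interval_ != 0 && n_ != 0 && n_ % interval_ == 0) *out_ << sep_;
        *out_ << c;
        ++n_;
    }

private:
    std::basic_ostream<CharT>* out_;  // pointer, so the functor stays copyable
    std::size_t interval_;
    CharT sep_;
    std::size_t n_;
};

// Convenience wrapper that collects the output in a string of the same type.
template <typename CharT>
std::basic_string<CharT> separated(const std::basic_string<CharT>& s,
                                   std::size_t interval, CharT sep) {
    std::basic_ostringstream<CharT> out;
    std::for_each(s.begin(), s.end(), basic_inserter<CharT>(out, interval, sep));
    return out.str();
}
```

The same template then serves narrow and wide strings, e.g. separated(std::string("thisisateststring"), 3, '-') or separated(std::wstring(L"abcdef"), 2, L'.').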
First, I will quickly describe my motivation for this and the actual problem:
I deal with large batches of files constantly and more specifically, I find myself having to rename them according to the following rule:
They may all contain words and digits, but only one set of digits is incrementing and not 'constant'. I need to extract those and only those digits and rename the files accordingly. For example:
Foo_1_Bar_2015.jpg
Foo_2_Bar_2015.jpg
Foo_03_Bar_2015.jpg
Foo_4_Bar_2015.jpg
Will be renamed:
1.jpg
2.jpg
3.jpg or 03.jpg (The leading zero can stay or go)
4.jpg
So what we start with is a vector with std::wstring objects for all the filenames in the specified directory. I urge you to stop reading for 3 minutes and think about how to approach this before I continue with my attempts and questions. I don't want my ideas to nudge you in one direction or another and I've always found fresh ideas are the best.
Now, here are two ways that I can think of:
1) Old style C string manipulation and comparisons:
In my mind, this entails parsing each filename and remembering each digit sequence position and length. This is easily stored in a vector or what-not for each file. This works well (basically uses string searches with increasing offsets):
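while((offset = filename.find_first_of(L"0123456789", offset)) != filename.npos)
{
// If the digits run to the end of the name, find_first_not_of returns npos
const auto end = filename.find_first_not_of(L"0123456789", offset);
size = (end == filename.npos ? filename.size() : end) - offset;
digit_locations_vec.emplace_back(offset, size);
offset += size;
}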
What I have after this is a vector of (Location, Size) pairs for all the digits in the filename, constant (by using the definition in the motivation) or not.
After this, chaos ensues, as you need to cross-reference the strings and find out which digits are the ones that need to be extracted. This will grow exponentially with the number of files (which tends to be huge), not to mention multiplied by the number of digit sequences in each string. Also, not very readable, maintainable or elegant. No go.
2) Regular Expressions
If there was ever a use for regex's, it's this. Create a regex object out of the first filename and try to match it with what comes next. Success? Instant ability to extract the required number. Failure? Add the offending filename as a new regex object and try to match against the two existing regex's. Rinse and repeat. The regex's would look something like this:
Foo_(\d+)_Bar_(\d+).jpg
or create a regex for each digit sequence separately:
Foo_(\d+)_Bar_2015.jpg
Foo_1_Bar_(\d+).jpg
The rest is cake. Just keep on matching as you go, and in the best case, it might require only one pass! Question is...
What I need to know:
1) Can you think of any other superior way to achieve this? I've been banging my head against the wall for days.
2) Although the cost of string manipulation and vector constructing\destructing may be substantial in the first method, perhaps it pales in comparison to the cost of regex objects. Second method, worst case: as many regex objects as files. Would this be disastrous with potentially thousands of files?
3) The second method can be adjusted for one of two possibilities: Few std::regex object constructions, many regex_match calls or the other way around. Which is more expensive, the construction of the regex object or trying to match a string with it?
For me (gcc4.6.2 32-bit optimizations O3), manual string manipulation was about 2x faster than regular expressions. Not worth the cost.
Example runnable complete code (link with boost_system and boost_regex, or change include if you have regex in the compiler already):
#include <ctime>
#include <cctype>
#include <algorithm>
#include <string>
#include <iostream>
#include <vector>
#include <sstream>
using namespace std;
#include <boost/regex.hpp>
using namespace boost;
/*
Foo_1_Bar_2015.jpg
Foo_1_Bar_2016.jpg
Foo_2_Bar_2016.jpg
Foo_2_Bar_2015.jpg
...
*/
vector<string> generateNames(int lenPerYear, int yearStart, int years);
/*
Foo_1_Bar_2015.jpg -> 1_2015.jpg
Foo_7_Bar_2016.jpg -> 7_2016.jpg
*/
void rename_method_string(const vector<string> & names, vector<string> & renamed);
void rename_method_regex(const vector<string> & names, vector<string> & renamed);
typedef void rename_method_t(const vector<string> & names, vector<string> & renamed);
void testMethod(const vector<string> & names, const string & description, rename_method_t method);
int main()
{
vector<string> names = generateNames(10000, 2014, 100);
cout << "names.size() = " << names.size() << '\n';
cout << '\n';
testMethod(names, "method 1 - string manipulation: ", rename_method_string);
cout << '\n';
testMethod(names, "method 2 - regular expressions: ", rename_method_regex);
return 0;
}
void testMethod(const vector<string> & names, const string & description, rename_method_t method)
{
vector<string> renamed(names.size());
clock_t timeStart = clock();
method(names, renamed);
clock_t timeEnd = clock();
cout << "renamed examples:\n";
for (int i = 0; i < 10 && i < names.size(); ++i)
cout << names[i] << " -> " << renamed[i] << '\n';
cout << description << 1000 * (timeEnd - timeStart) / CLOCKS_PER_SEC << " ms\n";
}
vector<string> generateNames(int lenPerYear, int yearStart, int years)
{
vector<string> result;
for (int year = yearStart, yearEnd = yearStart + years; year < yearEnd; ++year)
{
for (int i = 0; i < lenPerYear; ++i)
{
ostringstream oss;
oss << "Foo_" << i << "_Bar_" << year << ".jpg";
result.push_back(oss.str());
}
}
return result;
}
template<typename T>
bool equal_safe(T itShort, T itShortEnd, T itLong, T itLongEnd)
{
if (itLongEnd - itLong < itShortEnd - itShort)
return false;
return equal(itShort, itShortEnd, itLong);
}
void rename_method_string(const vector<string> & names, vector<string> & renamed)
{
//manually: "Foo_(\\d+)_Bar_(\\d+).jpg" -> \1_\2.jpg
const string foo = "Foo_", bar = "_Bar_", jpg = ".jpg";
for (int i = 0; i < names.size(); ++i)
{
const string & name = names[i];
//starts with foo?
if (!equal_safe(foo.begin(), foo.end(), name.begin(), name.end()))
{
renamed[i] = "ERROR no foo";
continue;
}
//extract number
auto it = name.begin() + foo.size();
for (; it != name.end() && isdigit(*it); ++it) {}
string str_num1(name.begin() + foo.size(), it);
//continues with bar?
if (!equal_safe(bar.begin(), bar.end(), it, name.end()))
{
renamed[i] = "ERROR no bar";
continue;
}
//extract number
it += bar.size();
auto itStart = it;
for (; it != name.end() && isdigit(*it); ++it) {}
string str_num2(itStart, it);
//check *.jpg
if (!equal_safe(jpg.begin(), jpg.end(), it, name.end()))
{
renamed[i] = "ERROR no .jpg";
continue;
}
renamed[i] = str_num1 + "_" + str_num2 + ".jpg";
}
}
void rename_method_regex(const vector<string> & names, vector<string> & renamed)
{
regex searching("Foo_(\\d+)_Bar_(\\d+).jpg");
smatch found;
for (int i = 0; i < names.size(); ++i)
{
if (regex_search(names[i], found, searching))
{
if (3 != found.size())
renamed[i] = "ERROR weird match";
else
renamed[i] = found[1].str() + "_" + found[2].str() + ".jpg";
}
else renamed[i] = "ERROR no match";
}
}
It produces output for me:
names.size() = 1000000
renamed examples:
Foo_0_Bar_2014.jpg -> 0_2014.jpg
Foo_1_Bar_2014.jpg -> 1_2014.jpg
Foo_2_Bar_2014.jpg -> 2_2014.jpg
Foo_3_Bar_2014.jpg -> 3_2014.jpg
Foo_4_Bar_2014.jpg -> 4_2014.jpg
Foo_5_Bar_2014.jpg -> 5_2014.jpg
Foo_6_Bar_2014.jpg -> 6_2014.jpg
Foo_7_Bar_2014.jpg -> 7_2014.jpg
Foo_8_Bar_2014.jpg -> 8_2014.jpg
Foo_9_Bar_2014.jpg -> 9_2014.jpg
method 1 - string manipulation: 421 ms
renamed examples:
Foo_0_Bar_2014.jpg -> 0_2014.jpg
Foo_1_Bar_2014.jpg -> 1_2014.jpg
Foo_2_Bar_2014.jpg -> 2_2014.jpg
Foo_3_Bar_2014.jpg -> 3_2014.jpg
Foo_4_Bar_2014.jpg -> 4_2014.jpg
Foo_5_Bar_2014.jpg -> 5_2014.jpg
Foo_6_Bar_2014.jpg -> 6_2014.jpg
Foo_7_Bar_2014.jpg -> 7_2014.jpg
Foo_8_Bar_2014.jpg -> 8_2014.jpg
Foo_9_Bar_2014.jpg -> 9_2014.jpg
method 2 - regular expressions: 796 ms
Also, I think it's completely pointless, because actual I/O (getting file name, renaming file) will be much slower than any CPU string manipulation, in your example. So to answer your questions:
I don't see any superior way, I/O is what's slow, don't bother with superiority
The regex object wasn't expensive in my experience: within a 2x slow-down versus the manual method. That's a constant slow-down, and negligible compared to how much work it saves.
How many std::regex objects for how many regex_match calls? It depends on the number of regex_match calls: the more matches there are, the more it's worth creating a specific std::regex object. However, this will be very library-dependent. If there are lots of match calls, create separate objects; if you are not sure, do not bother.
Why don't you use split to split the string between letters and numbers:
Regex.Split(fileName, "(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)");
then get whichever index you need for the numbers, maybe using a Where clause to find the ones increasing in value while the other indices are matching. You can then use .Last() to get the extension.
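The same split-and-compare idea can be sketched in C++ without a regex. This is a rough illustration under stated assumptions: it uses narrow std::string instead of the question's std::wstring, compares digit groups as text (so "3" and "03" count as different), and the helper names digit_runs and varying_run_index are made up for the example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Splits `name` into its runs of consecutive digits,
// e.g. "Foo_03_Bar_2015.jpg" -> {"03", "2015"}.
std::vector<std::string> digit_runs(const std::string& name) {
    std::vector<std::string> runs;
    std::string current;
    for (std::size_t i = 0; i < name.size(); ++i) {
        if (std::isdigit(static_cast<unsigned char>(name[i]))) {
            current.push_back(name[i]);
        } else if (!current.empty()) {
            runs.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) runs.push_back(current);
    return runs;
}

// Returns the index of the first digit run that differs from the first file's
// runs, or -1 if no run varies or the files disagree on the number of runs.
int varying_run_index(const std::vector<std::string>& names) {
    if (names.empty()) return -1;
    const std::vector<std::string> first = digit_runs(names[0]);
    for (std::size_t f = 1; f < names.size(); ++f) {
        const std::vector<std::string> runs = digit_runs(names[f]);
        if (runs.size() != first.size()) return -1;
        for (std::size_t r = 0; r < runs.size(); ++r) {
            if (runs[r] != first[r]) return static_cast<int>(r);
        }
    }
    return -1;
}
```

Once the varying index is known, digit_runs(name)[index] is the number to keep for each file. In the best case the second file already reveals the index, so the scan stops early.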
I have a string in form "blah-blah..obj_xx..blah-blah" where xx are digits. E.g. the string may be "root75/obj_43.dat".
I want to read "xx" (or 43 from the sample above) as an integer. How do I do it?
I tried to find "obj_" first:
std::string::size_type const cpos = name.find("obj_");
assert(std::string::npos != cpos);
but what's next?
My GCC doesn't support regexes fully, but I think this should work:
#include <iostream>
#include <string>
#include <regex>
#include <iterator>
int main ()
{
std::string input ("blah-blah..obj_42..blah-blah");
std::regex expr ("obj_([0-9]+)");
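std::sregex_iterator i = std::sregex_iterator(input.begin(), input.end(), expr);
if (i == std::sregex_iterator()) return 1; // no match found
std::smatch match = *i;
// str(1) is the first capture group (the digits); str(0) would be "obj_42"
int number = std::stoi(match.str(1));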
std::cout << number << '\n';
}
With something this simple you can do
auto b = name.find_first_of("0123456789", cpos);
if (b != std::string::npos)
{
auto e = name.find_first_not_of("0123456789", b);
// substr takes a length, not an end position
auto digits = name.substr(b, e == std::string::npos ? e : e - b);
int n = std::stoi(digits);
}
else
{
// Error handling
}
For anything more complicated I would use regex.
How about:
#include <iostream>
#include <string>
int main()
{
const std::string test("root75/obj_43.dat");
int number;
// validate input:
const auto index = test.find("obj_");
if(index != std::string::npos)
{
number = std::stoi(test.substr(index+4));
std::cout << "number: " << number << ".\n";
}
else
std::cout << "Input validation failed.\n";
}
This includes (very) basic input validation (e.g. it will fail if the string contains multiple obj_) and copes with variable-length numbers at the end, or even more text following them (adjust the substr call accordingly). std::stoi throws if no conversion is possible, and you can pass it a second argument to learn how many characters it consumed.
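A hedged sketch of what that extra checking might look like. The helper parse_after_tag and its parameters are hypothetical. It relies on std::stoi throwing std::invalid_argument when no digits can be converted and std::out_of_range when the value does not fit in an int, and on the second argument receiving the number of characters consumed.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: parses the integer that directly follows `tag` in `text`.
// On success stores the number in `value`, the unparsed remainder in `rest`,
// and returns true. Returns false if the tag is missing or nothing parses.
bool parse_after_tag(const std::string& text, const std::string& tag,
                     int& value, std::string& rest) {
    const std::string::size_type index = text.find(tag);
    if (index == std::string::npos) return false;
    const std::string tail = text.substr(index + tag.size());
    try {
        std::size_t used = 0;            // filled in by std::stoi
        value = std::stoi(tail, &used);  // throws on failure
        rest = tail.substr(used);        // whatever follows the number
        return true;
    } catch (const std::invalid_argument&) {
        return false;  // no digits after the tag
    } catch (const std::out_of_range&) {
        return false;  // number does not fit in an int
    }
}
```

Note that std::stoi also skips leading whitespace and accepts a sign, so a stricter validator would have to check the first character of the tail itself.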
Here's another option
//your code:
std::string::size_type const cpos = name.find("obj_");
assert(std::string::npos != cpos);
//my code starts here:
int n;
std::stringstream sin(name.substr(cpos+4));
sin>>n;
Dirt simple method, though probably pretty inefficient, and doesn't take advantage of the STL:
(Note that I didn't try to compile this)
unsigned GetFileNumber(std::string &s)
{
const std::string extension = ".dat";
/// get starting position - first character to the left of the file extension
/// in a real implementation, you'd want to verify that the string actually contains
/// the correct extension.
int i = (int)(s.size() - extension.size() - 1);
unsigned sum = 0;
int tensMultiplier = 1;
while (i >= 0)
{
/// get the integer value of this digit - subtract (int)'0' rather than
/// using the ASCII code of `0` directly for clarity. Optimizer converts
/// it to a literal immediate at compile time, anyway.
int digit = s[i] - (int)'0';
/// if this is a valid numeric character
if (digit >= 0 && digit <= 9)
{
/// add the digit's value, adjusted for its place within the numeric
/// substring, to the accumulator
sum += digit * tensMultiplier;
/// set the tens place multiplier for the next digit to the left.
tensMultiplier *= 10;
}
else
{
break;
}
i--;
}
return sum;
}
If you need it as a string, just append the found digits to a result string rather than accumulating their values in sum.
This also assumes that .dat is the last part of your string. If not, I'd start at the end, count left until you find a numeric character, and then start the above loop. This is nice because it's O(n), but may not be as clear as the regex or find approaches.
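The "start at the end and count left" variant could look roughly like this. It is a sketch only: the name last_digit_run is invented, and it returns the digits as text rather than accumulating a value.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the last run of digits in `s` as text,
// or an empty string if `s` contains no digit.
std::string last_digit_run(const std::string& s) {
    std::size_t end = s.size();
    // Walk left over the non-digit suffix (e.g. ".dat").
    while (end > 0 && !std::isdigit(static_cast<unsigned char>(s[end - 1]))) --end;
    std::size_t begin = end;
    // Keep walking left while the characters are digits.
    while (begin > 0 && std::isdigit(static_cast<unsigned char>(s[begin - 1]))) --begin;
    return s.substr(begin, end - begin);
}
```

Both loops touch each character at most once, so this stays O(n) and does not depend on a particular extension.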
This question already has answers here:
C++ Remove punctuation from String
In my program I am checking a whole C string; if any spaces or punctuation marks are found, I want to put an empty character at that location, but the compiler is giving me an error: empty character constant.
Please help me out. In my loop I am checking like this:
if(ispunct(str1[start])) {
str1[start]=''; // << empty character constant.
}
if(isspace(str1[start])) {
str1[start]=''; // << empty character constant.
}
This is where my errors are; please correct me.
For example, if the word is "str,, ing", the output should be "string".
There is no such thing as an empty character.
If you mean a space then change '' to ' ' (with a space in it).
If you mean NUL then change it to '\0'.
Edit: the answer is no longer relevant now that the OP has edited the question. Leaving up for posterity's sake.
If you want to add a null character, use '\0'. If you want a different character, use the appropriate character for that. You can't assign it nothing; that's meaningless. That's like saying
int myHexInt = 0x;
or
long long myInteger = L;
The compiler will error. Put in the value you wanted. In the char case, that's a value from 0 to 255.
UPDATE:
From the edit to OP's question, it's apparent that he/she wanted to trim a string of punctuation and space characters.
As detailed in the flagged possible duplicate, one way is to use remove_copy_if:
string test = "THisisa test;;';';';";
string temp, finalresult;
remove_copy_if(test.begin(), test.end(), std::back_inserter(temp), ptr_fun<int, int>(&ispunct));
remove_copy_if(temp.begin(), temp.end(), std::back_inserter(finalresult), ptr_fun<int, int>(&isspace));
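As a side note, std::ptr_fun was deprecated in C++11 and removed in C++17. A minimal sketch of the same clean-up with a lambda and the erase-remove idiom follows; the function name strip_punct_and_space is made up for the example, and handling both categories in a single pass is a choice made here, not part of the answer above.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns a copy of `text` without punctuation or whitespace.
std::string strip_punct_and_space(std::string text) {
    text.erase(std::remove_if(text.begin(), text.end(),
                              // Taking unsigned char avoids undefined behaviour
                              // in ispunct/isspace for negative char values.
                              [](unsigned char c) {
                                  return std::ispunct(c) || std::isspace(c);
                              }),
               text.end());
    return text;
}
```

For the asker's example, strip_punct_and_space("str,, ing") gives "string".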
ORIGINAL
Examining your question, replacing spaces with spaces is redundant, so you really need to figure out how to replace punctuation characters with spaces. You can do so using a comparison function (by wrapping std::ispunct) in tandem with std::replace_if from the STL:
#include <string>
#include <algorithm>
#include <iostream>
#include <cctype>
using namespace std;
bool is_punct(const char& c) {
return ispunct(c);
}
int main() {
string test = "THisisa test;;';';';";
char test2[] = "THisisa test;;';';'; another";
size_t size = sizeof(test2)/sizeof(test2[0]);
replace_if(test.begin(), test.end(), is_punct, ' ');//for C++ strings
replace_if(&test2[0], &test2[size-1], is_punct, ' ');//for c-strings
cout << test << endl;
cout << test2 << endl;
}
This outputs:
THisisa test
THisisa test another
Try this (as you asked for cstring explicitly):
char str1[100] = "str,, ing";
if(ispunct(str1[start]) || isspace(str1[start])) {
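// The source and destination overlap, so use memmove (strncpy on overlapping
// buffers is undefined); strlen(str1) - start bytes covers the tail plus '\0'
memmove(str1 + start, str1 + start + 1, strlen(str1) - start);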
}
Well, if you are doing this in pure C, there are more efficient solutions (have a look at @MichaelPlotke's answer for details).
But as you also explicitly ask for C++, I'd recommend a solution as follows:
Note that you can use the standard C++ algorithms on 'plain' C-style character arrays as well. You just have to place your predicate conditions for removal into a small helper functor and use it with the std::remove_if() algorithm:
struct is_char_category_in_question {
bool operator()(const char& c) const;
};
And later use it like:
#include <string>
#include <algorithm>
#include <iostream>
#include <cctype>
#include <cstring>
// Writing the functor like this gives the compiler the best chance to
// inline the predicate:
struct is_char_category_in_question {
bool operator()(const char& c) const {
return std::ispunct(c) || std::isspace(c);
}
};
int main() {
static char str1[100] = "str,, ing";
size_t size = strlen(str1);
// Using std::remove_if() is likely to provide the best balance of performance
// and code size efficiency you can expect from your compiler implementation.
std::remove_if(&str1[0], &str1[size + 1], is_char_category_in_question());
// Regarding the end of the range in the statement above: note that we have to
// add 1 to the strlen() calculated size so that the closing `\0` character of
// the c-style string is copied as well and terminates the result!
std::cout << str1 << std::endl; // Prints: string
}
As I don't like the accepted answer, here's mine:
#include <stdio.h>
#include <string.h>
#include <cctype>
int main() {
char str[100] = "str,, ing";
int bad = 0;
int cur = 0;
while (str[cur] != '\0') {
if (bad < cur && !ispunct(str[cur]) && !isspace(str[cur])) {
str[bad] = str[cur];
}
if (ispunct(str[cur]) || isspace(str[cur])) {
cur++;
}
else {
cur++;
bad++;
}
}
str[bad] = '\0';
fprintf(stdout, "cur = %d; bad = %d; str = %s\n", cur, bad, str);
return 0;
}
Which outputs cur = 9; bad = 6; str = string
This has the advantage of being more efficient and more readable, hm, well, in a style I happen to like better (see comments for a lengthy debate / explanation).
What is the effective way to replace all occurrences of a character with another character in std::string?
std::string doesn't contain such a function, but you could use the stand-alone replace function from the <algorithm> header.
#include <algorithm>
#include <string>
void some_func() {
std::string s = "example string";
std::replace( s.begin(), s.end(), 'x', 'y'); // replace all 'x' to 'y'
}
The question is centered on character replacement, but, as I found this page very useful (especially Konrad's remark), I'd like to share this more generalized implementation, which allows to deal with substrings as well:
std::string ReplaceAll(std::string str, const std::string& from, const std::string& to) {
size_t start_pos = 0;
while((start_pos = str.find(from, start_pos)) != std::string::npos) {
str.replace(start_pos, from.length(), to);
start_pos += to.length(); // Skip the inserted text, so a 'to' that contains 'from' is not matched again
}
return str;
}
Usage:
std::cout << ReplaceAll(std::string("Number Of Beans"), std::string(" "), std::string("_")) << std::endl;
std::cout << ReplaceAll(std::string("ghghjghugtghty"), std::string("gh"), std::string("X")) << std::endl;
std::cout << ReplaceAll(std::string("ghghjghugtghty"), std::string("gh"), std::string("h")) << std::endl;
Outputs:
Number_Of_Beans
XXjXugtXty
hhjhugthty
EDIT:
If performance is a concern, the above can be implemented in a more suitable way by returning nothing (void) and performing the changes "in-place"; that is, by directly modifying the string argument str, passed by reference instead of by value. This avoids an extra, costly copy of the original string by overwriting it.
Code :
static inline void ReplaceAll2(std::string &str, const std::string& from, const std::string& to)
{
// Same inner code...
// No return statement
}
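Spelled out in full, the in-place variant might look like the sketch below. The name replace_all_in_place is chosen here to keep it distinct from the snippets above, and the guard against an empty from is an addition of this sketch, since an empty pattern would otherwise match at every position forever.

```cpp
#include <string>

// Sketch of the in-place variant: modifies `str` directly and returns nothing.
void replace_all_in_place(std::string& str, const std::string& from,
                          const std::string& to) {
    if (from.empty()) return;  // added guard: an empty pattern never stops matching
    std::string::size_type pos = 0;
    while ((pos = str.find(from, pos)) != std::string::npos) {
        str.replace(pos, from.length(), to);
        pos += to.length();  // continue after the inserted text
    }
}
```

Usage mirrors the by-value version: after std::string s = "ghghjghugtghty"; replace_all_in_place(s, "gh", "h"); the string s holds hhjhugthty.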
Hope this will be helpful for some others...
I thought I'd toss in the boost solution as well:
#include <boost/algorithm/string/replace.hpp>
// in place
std::string in_place = "blah#blah";
boost::replace_all(in_place, "#", "@");
// copy
const std::string input = "blah#blah";
std::string output = boost::replace_all_copy(input, "#", "@");
Imagine a large binary blob where all 0x00 bytes shall be replaced by "\1\x30" and all 0x01 bytes by "\1\x31" because the transport protocol allows no \0-bytes.
In cases where:
the replacing and the to-replaced string have different lengths,
there are many occurrences of the to-replaced string within the source string and
the source string is large,
the provided solutions cannot be applied (because they replace only single characters) or have a performance problem, because they would call string::replace several times which generates copies of the size of the blob over and over.
(I do not know the boost solution, maybe it is OK from that perspective)
This one walks along all occurrences in the source string and builds the new string piece by piece once:
void replaceAll(std::string& source, const std::string& from, const std::string& to)
{
std::string newString;
newString.reserve(source.length()); // avoids a few memory allocations
std::string::size_type lastPos = 0;
std::string::size_type findPos;
while(std::string::npos != (findPos = source.find(from, lastPos)))
{
newString.append(source, lastPos, findPos - lastPos);
newString += to;
lastPos = findPos + from.length();
}
// Care for the rest after last occurrence
newString += source.substr(lastPos);
source.swap(newString);
}
A simple find and replace for a single character would go something like:
s.replace(s.find("x"), 1, "y")
To do this for the whole string, the easy thing to do would be to loop until your s.find starts returning npos. I suppose you could also catch std::out_of_range (which replace throws when handed npos as the position) to exit the loop, but that's kinda ugly.
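The find-until-npos loop could be sketched as follows. The helper name replace_each is hypothetical, and since both sides are single characters this sketch assigns through operator[] instead of calling replace.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replaces every `from` in `s` with `to` and returns
// how many characters were changed.
std::size_t replace_each(std::string& s, char from, char to) {
    std::size_t count = 0;
    for (std::string::size_type pos = s.find(from);
         pos != std::string::npos;        // find signals "no more" with npos
         pos = s.find(from, pos + 1)) {   // resume just past the last hit
        s[pos] = to;
        ++count;
    }
    return count;
}
```

Checking for npos before touching the string means replace is never called with an invalid position, so no exception handling is needed.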
For completeness, here's how to do it with std::regex.
#include <regex>
#include <string>
int main()
{
const std::string s = "example string";
const std::string r = std::regex_replace(s, std::regex("x"), "y");
}
If you're looking to replace more than a single character, and are dealing only with std::string, then this snippet would work, replacing sNeedle in sHaystack with sReplace; sNeedle and sReplace do not need to be the same size. This routine uses the while loop to replace all occurrences, rather than just the first one found from left to right. Note that it searches from the start on every pass, so it never terminates if sReplace itself contains sNeedle.
while(sHaystack.find(sNeedle) != std::string::npos) {
sHaystack.replace(sHaystack.find(sNeedle),sNeedle.size(),sReplace);
}
As Kirill suggested, either use the replace method or iterate along the string replacing each char independently.
Alternatively you can use the find method or find_first_of depending on what you need to do. None of these solutions will do the job in one go, but with a few extra lines of code you ought to make them work for you. :-)
What about Abseil StrReplaceAll? From the header file:
// This file defines `absl::StrReplaceAll()`, a general-purpose string
// replacement function designed for large, arbitrary text substitutions,
// especially on strings which you are receiving from some other system for
// further processing (e.g. processing regular expressions, escaping HTML
// entities, etc.). `StrReplaceAll` is designed to be efficient even when only
// one substitution is being performed, or when substitution is rare.
//
// If the string being modified is known at compile-time, and the substitutions
// vary, `absl::Substitute()` may be a better choice.
//
// Example:
//
// std::string html_escaped = absl::StrReplaceAll(user_input, {
// {"&", "&amp;"},
// {"<", "&lt;"},
// {">", "&gt;"},
// {"\"", "&quot;"},
// {"'", "&#39;"}});
#include <iostream>
#include <string>
using namespace std;
// Replace function..
string replace(string word, string target, string replacement){
int len, loop=0;
string nword="", let;
len=word.length();
len--;
while(loop<=len){
let=word.substr(loop, 1);
if(let==target){
nword=nword+replacement;
}else{
nword=nword+let;
}
loop++;
}
return nword;
}
//Main..
int main() {
string word;
cout<<"Enter Word: ";
cin>>word;
cout<<replace(word, "x", "y")<<endl;
return 0;
}
Old School :-)
std::string str = "H:/recursos/audio/youtube/libre/falta/";
for (int i = 0; i < str.size(); i++) {
if (str[i] == '/') {
str[i] = '\\';
}
}
std::cout << str;
Result:
H:\recursos\audio\youtube\libre\falta\
For simple situations this works pretty well without using any library other than std::string (which is already in use).
Replace all occurrences of character a with character b in some_string:
for (size_t i = 0; i < some_string.size(); ++i) {
if (some_string[i] == 'a') {
some_string.replace(i, 1, "b");
}
}
If the string is large or multiple calls to replace are an issue, you can apply the technique mentioned in this answer: https://stackoverflow.com/a/29752943/3622300
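For the plain one-character-for-one-character case, the standard algorithm std::replace from <algorithm> does the same job as the loop above in a single call. A minimal sketch follows; the wrapper name replace_all_chars is made up for this example, and this is not necessarily the technique from the linked answer.

```cpp
#include <algorithm>
#include <string>

// Replace every occurrence of character `from` with `to`, in place.
// replace_all_chars is a hypothetical wrapper name; std::replace is the
// standard algorithm doing the work.
void replace_all_chars(std::string& s, char from, char to)
{
    std::replace(s.begin(), s.end(), from, to);
}
```

Calling replace_all_chars(s, 'a', 'b') on a string holding "banana" leaves it as "bbnbnb".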
Here's a solution I rolled, in a maximal DRY spirit.
It searches for sNeedle in sHaystack and replaces it with sReplace,
nTimes times if nTimes is non-zero, otherwise every occurrence of sNeedle.
It does not search again inside the replaced text.
std::string str_replace(
std::string sHaystack, std::string sNeedle, std::string sReplace,
size_t nTimes=0)
{
size_t found = 0, pos = 0, c = 0;
size_t len = sNeedle.size();
size_t replen = sReplace.size();
std::string input(sHaystack);
do {
found = input.find(sNeedle, pos);
if (found == std::string::npos) {
break;
}
input.replace(found, len, sReplace);
pos = found + replen;
++c;
} while(!nTimes || c < nTimes);
return input;
}
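A short usage sketch may help show what nTimes does. The function is repeated here in compact form so the snippet compiles on its own; the empty-needle guard is my addition and not part of the code above, since find("") matches at every position and the loop would otherwise never end.

```cpp
#include <cstddef>
#include <string>

// Same algorithm as str_replace above, repeated so this snippet compiles alone.
// The empty-needle guard is an addition, not part of the original answer.
std::string str_replace(std::string input, const std::string& needle,
                        const std::string& repl, std::size_t nTimes = 0)
{
    if (needle.empty()) return input;  // an empty needle matches everywhere
    std::size_t pos = 0, count = 0;
    while (!nTimes || count < nTimes) {
        const std::size_t found = input.find(needle, pos);
        if (found == std::string::npos) break;
        input.replace(found, needle.size(), repl);
        pos = found + repl.size();     // resume after the inserted text
        ++count;
    }
    return input;
}
```

With this, str_replace("a-b-c-d", "-", "--") gives a--b--c--d (all occurrences, and the inserted dashes are not matched again), while str_replace("a-b-c-d", "-", "+", 2) gives a+b+c-d (only the first two).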
I think I'd use std::replace_if()
A simple character-replacer (requested by OP) can be written by using standard library functions.
For an in-place version:
#include <string>
#include <algorithm>
void replace_char(std::string& in,
std::string::value_type srch,
std::string::value_type repl)
{
std::replace_if(std::begin(in), std::end(in),
[&srch](std::string::value_type v) { return v==srch; },
repl);
return;
}
and an overload that returns a copy if the input is a const string:
std::string replace_char(std::string const& in,
std::string::value_type srch,
std::string::value_type repl)
{
std::string result{ in };
replace_char(result, srch, repl);
return result;
}
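std::replace_if is most useful when the condition is more than a simple equality test. As a sketch of that, the following replaces every whitespace character with one replacement character; the helper name replace_whitespace is made up for this example.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Replace every whitespace character in `s` with `repl`, in place.
// replace_whitespace is a hypothetical helper name.
void replace_whitespace(std::string& s, char repl)
{
    std::replace_if(s.begin(), s.end(),
                    // unsigned char avoids undefined behaviour in std::isspace
                    // for negative char values
                    [](unsigned char c) { return std::isspace(c) != 0; },
                    repl);
}
```

Applied to "a b\tc\nd" with '_' as the replacement, the string becomes a_b_c_d. For a plain character-equals-character test, std::replace is enough and needs no lambda.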
This works! I used something similar for a bookstore app, where the inventory was stored as CSV (in a .dat file). Note that when the replacement is a single character, e.g. '|', it must be written in double quotes ("|"): this overload of replace takes a string, so a char literal causes an invalid conversion error.
#include <iostream>
#include <string>
using namespace std;
int main()
{
int count = 0; // number of occurrences replaced so far
// holds the corrected text up to the current position j
string holdWord = "";
// temporary used to splice the corrected prefix back in
string holdTemp = "";
// one CSV entry from a book store inventory
string holdLetter = "Big Java 7th Ed,Horstman,978-1118431115,99.85";
// j = npos
for (int j = 0; j < holdLetter.length(); j++) {
if (holdLetter[j] == ',') {
if ( count == 0 )
{
holdWord = holdLetter.replace(j, 1, " | ");
}
else {
string holdTemp1 = holdLetter.replace(j, 1, " | ");
// since replacement is three positions in length,
// must replace new replacement's 0 to npos-3, with
// the 0 to npos - 3 of the old replacement
holdTemp = holdTemp1.replace(0, j-3, holdWord, 0, j-3);
holdWord = "";
holdWord = holdTemp;
}
holdTemp = "";
count++;
}
}
cout << holdWord << endl;
return 0;
}
// result:
Big Java 7th Ed | Horstman | 978-1118431115 | 99.85
I am currently on CentOS, which is unusual for me, so my compiler version is shown below. This is g++, which defaults to C++98:
g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
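The same comma-to-" | " conversion can also be done by building a new string in one pass, which avoids the bookkeeping with temporary strings. This is only a sketch under my own assumptions: the helper name replace_char_with is made up, and the code sticks to C++98 so it should build with the compiler listed above.

```cpp
#include <string>

// Build a new string in one pass, writing `repl` wherever `from` occurs.
// replace_char_with is a hypothetical helper name.
std::string replace_char_with(const std::string& in, char from,
                              const std::string& repl)
{
    std::string out;
    out.reserve(in.size());  // lower bound; grows if repl is longer than one char
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        if (in[i] == from) out += repl;   // emit the replacement text
        else               out += in[i];  // copy the character unchanged
    }
    return out;
}
```

Calling replace_char_with(holdLetter, ',', " | ") on the entry above gives the same result line: Big Java 7th Ed | Horstman | 978-1118431115 | 99.85.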
This is not the only method missing from the standard library; it was intended to be low level.
This use case and many others are covered by general-purpose libraries such as:
POCO
Abseil
Boost
QtCore
QtCore and QString have my preference: they support UTF-8 and use fewer templates, which means understandable errors and faster compilation. Qt uses the "q" prefix, which makes namespaces unnecessary and simplifies headers.
Boost often generates hideous error messages and slow compile times.
POCO seems to be a reasonable compromise.
How about replacing a delimiter with any string using only good old C string functions? Note that strtok treats its second argument as a set of single-character delimiters, not as one multi-character string.
char original[256]="First Line\nNext Line\n", dest[256]="";
const char* replace_this = "\n"; // every character listed here is treated as a separate delimiter
const char* with_this = "\r\n"; // the replacement can be of any length
/* get the first token */
char* token = strtok(original, replace_this);
/* walk through other tokens */
while (token != NULL) {
strcat(dest, token);
strcat(dest, with_this);
token = strtok(NULL, replace_this);
}
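dest should now have what we are looking for, provided the result fits in its 256 bytes. Be aware that strtok modifies original, skips empty tokens (so consecutive delimiters collapse into one), and that this loop appends with_this after the last token even if original did not end with a delimiter.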
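If the text to replace really is a multi-character string, strstr is the C function that searches for it as a whole. The following is a sketch under my own assumptions, not a drop-in routine: replace_cstr is a made-up name, the caller supplies the destination buffer and its capacity, and on failure the contents of dest are unspecified.

```cpp
#include <cstddef>
#include <cstring>

// Copy `src` into `dest` (capacity `cap` bytes), replacing every occurrence of
// the whole string `needle` with `repl`. Returns false if `dest` is too small
// or `needle` is empty. replace_cstr is a hypothetical helper name.
bool replace_cstr(const char* src, const char* needle, const char* repl,
                  char* dest, std::size_t cap)
{
    const std::size_t nlen = std::strlen(needle);
    const std::size_t rlen = std::strlen(repl);
    if (nlen == 0 || cap == 0) return false;
    std::size_t used = 0;
    const char* hit;
    while ((hit = std::strstr(src, needle)) != NULL) {
        const std::size_t chunk = static_cast<std::size_t>(hit - src);
        if (used + chunk + rlen >= cap) return false;  // keep room for '\0'
        std::memcpy(dest + used, src, chunk);          // text before the match
        std::memcpy(dest + used + chunk, repl, rlen);  // the replacement
        used += chunk + rlen;
        src = hit + nlen;                              // continue after the match
    }
    const std::size_t tail = std::strlen(src);
    if (used + tail >= cap) return false;
    std::memcpy(dest + used, src, tail + 1);           // remainder plus '\0'
    return true;
}
```

Unlike the strtok loop, this leaves the source untouched, keeps consecutive matches (so "a\n\nb" becomes "a\r\n\r\nb"), and adds nothing after the final token unless the source actually ends with the needle.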