Retaining just needed characters in string in C++ - c++

I have a string of the form:
http://stackoverflow.com/q""uestions/ask/%33854#/á
Now I want to delete all characters from this string except alphanumeric characters and the characters :, / and ., so that the output string becomes:
http://stackoverflow.com/questions/ask/33854/á
I know I can traverse this string character by character and remove the unnecessary characters. But is there a function in the standard library that can help me remove unwanted characters? If I know the unwanted characters, I can use std::remove and std::replace to selectively remove or replace them. But here I do not know the unwanted characters; I only know the characters that I want to retain.
Is there some way to retain only the necessary characters and remove everything else?
gcc version which I am using is:
gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)
EDIT: I also want to include characters like á. I don't know what they are called. I know they are not alphanumeric, but I don't know how to check for them.

Since your compiler is ancient and regex support is relatively recent in GCC (from GCC 4.9 onward), regexes are not an option. We'll use the erase-remove idiom with a named function, because GCC 4.4 does not yet support lambdas.
#include <algorithm>
#include <iostream>
#include <locale>
#include <string>
// true for characters that should be removed
bool is_special_character(char c) {
std::locale loc("your_locale_string_here");
return !std::isalnum(c, loc) && c != ':' && c != '/' && c != '.';
}
int main()
{
std::string s = "http://stackoverflow.com/q\"\"uestions/ask/%33854#";
// interesting part here
s.erase(std::remove_if(s.begin(), s.end(), is_special_character), s.end());
std::cout << s << '\n';
}

You will want to use std::remove_if and define a predicate to return false only if the characters are the ones you want to retain.
You'll also want to resize the string to the new length after you do this process. As an example:
#include <string>
#include <algorithm>
#include <iostream>
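#include <cctype>
bool is_special_char(char c)
{
// std::isalnum from <cctype> needs a value representable as unsigned char
return !( std::isalnum(static_cast<unsigned char>(c)) || c == ':' || c == '/' || c == '.');
}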
int main()
{
std::string s = "http://stackoverflow.com/q\"\"uestions/ask/%33854#";
std::cout << s << std::endl;
std::string::iterator new_end = std::remove_if(s.begin(), s.end(), is_special_char);
s.resize(new_end - s.begin());
std::cout << s << std::endl;
}
will output
http://stackoverflow.com/q""uestions/ask/%33854#
http://stackoverflow.com/questions/ask/33854
If you want to handle non-ASCII characters such as á, one option is to use a std::wstring instead of a std::string. An example using this (and incorporating Wintermute's nice use of the erase/remove idiom) would be:
#include <string>
#include <algorithm>
#include <iostream>
#include <locale>
#include <cwctype>
bool is_special_char(wchar_t c)
{
return !( std::iswalnum(c) || c == ':' || c == '/' || c == '.');
}
int main()
{
std::locale::global( std::locale("en_US.UTF-8") ); //Set the global locale to Unicode
std::wstring s = L"http://stáckoverflow.com/q\"\"uestions/ask/%33854#";
std::wcout << s << std::endl;
s.erase( std::remove_if(s.begin(), s.end(), is_special_char), s.end() );
std::wcout << s << std::endl;
}
which will output
http://stáckoverflow.com/q""uestions/ask/%33854#
http://stáckoverflow.com/questions/ask/33854

But here I do not know the unknown characters, I only know the characters which I want to retain.
Whitelist the characters you want to retain using a char array for example. Then run through each character in your string and remove it if it isn't in the whitelist.
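A minimal sketch of that whitelist idea, assuming the allowed characters are passed in as a std::string. The names NotInWhitelist and keep_only are made up for this illustration. A functor is used rather than a lambda so the code should also compile with an old compiler such as GCC 4.4.

```cpp
#include <algorithm>
#include <string>

// Predicate object: true for characters that are NOT in the whitelist.
// (Hypothetical name; a functor is used so this also compiles as C++98.)
struct NotInWhitelist {
    explicit NotInWhitelist(const std::string& allowed) : allowed_(allowed) {}
    bool operator()(char c) const {
        return allowed_.find(c) == std::string::npos;
    }
private:
    std::string allowed_;
};

// Hypothetical helper: returns a copy of `input` with every character
// that is missing from `allowed` removed (erase-remove idiom).
std::string keep_only(std::string input, const std::string& allowed) {
    input.erase(std::remove_if(input.begin(), input.end(), NotInWhitelist(allowed)),
                input.end());
    return input;
}
```

Calling keep_only(s, "abcdefghijklmnopqrstuvwxyz0123456789:/.") would keep only lowercase letters, digits and the three listed symbols. Note that this works byte by byte, so a multi-byte UTF-8 character such as á is not covered by a plain char whitelist.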

You could try something like this:
std::string str ("This is an example sentence.");
std::cout << str << '\n';
// "This is an example sentence."
str.erase (10,8); // ^^^^^^^^
std::cout << str << '\n';
// "This is an sentence."
str.erase (str.begin()+9); // ^
std::cout << str << '\n';
// "This is a sentence."
str.erase (str.begin()+5, str.end()-9); // ^^^^^
std::cout << str << '\n';
// "This sentence."

Related

how connect regex replace to function in c++

#include <iostream>
#include <iterator>
#include <regex>
#include <string>
std::string ty(std::string text){
if(text == "brown")
return "true";
else
return "qw";
}
int main()
{
std::string text = "Quick $brown fox";
std::cout << '\n' << std::regex_replace(text, std::regex(R"(\\$(.*))"), ty("$&")) << '\n';
}
I am using C++11. Without the if in ty it worked, but with the if it does not. I don't know what to do.
There's a lot of different things wrong with the original code.
Firstly here's some working code
#include <iostream>
#include <iterator>
#include <regex>
#include <string>
std::string ty(std::string text){
if(text == "brown")
return "true";
else
return "qw";
}
int main()
{
std::string text = "Quick $brown fox";
std::smatch m;
if (std::regex_search(text, m, std::regex(R"(\$([[:alpha:]][[:alnum:]]*))")))
{
std::cout << '\n' << ty(std::string(m[1].first, m[1].second)) << '\n';
}
else
{
std::cout << "\nno match\n";
}
}
Some things that were wrong with the original code
Firstly the function being called was wrong. Use std::regex_search to search for matches in a string. Capture the results in an std::smatch object and then use those results to call the ty function.
The regex was wrong in two different ways. Firstly \\ is wrong because you are using a raw string literal, so only a single backslash is required. Secondly (.*) is wrong because that will match the entire rest of the string. You only want to match the word following the dollar. I've used ([[:alpha:]][[:alnum:]]*) instead. That might not be exactly what you want but it works for this example. You can modify it if you want.
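If the goal is to replace every $word in the text with the result of ty, rather than just print it, one possible approach is to walk the matches with std::sregex_iterator and build the output by hand. This is only a sketch: the helper name replace_with_ty is invented here, and it reuses the regex from the working code above.

```cpp
#include <regex>
#include <string>

// Same mapping as the ty function in the question.
std::string ty(const std::string& text) {
    return text == "brown" ? "true" : "qw";
}

// Hypothetical helper: replaces every "$word" in `text` with ty("word").
std::string replace_with_ty(const std::string& text) {
    static const std::regex re(R"(\$([[:alpha:]][[:alnum:]]*))");
    std::string result;
    std::string::const_iterator last = text.begin();
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it) {
        const std::smatch& m = *it;
        result.append(last, m[0].first);  // text between the previous match and this one
        result += ty(m[1].str());         // replacement computed from the captured word
        last = m[0].second;
    }
    result.append(last, text.end());      // remainder after the final match
    return result;
}
```

With the input "Quick $brown fox" this should produce "Quick true fox". std::regex_replace itself only accepts a format string, which is why the function call has to happen in a loop like this.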

Am replacing the first occurrence of a string, how can I replace all occurrences?

I have already built the basic structure using a loop plus replace. In C++, str.replace only replaces a single occurrence, but in some cases we need to replace every occurrence of the same string. My code compiles successfully and prints to the screen, but it does not seem to replace anything.
Thanks in advance
here's my code:
#include <iostream>
#include <fstream>
#include <string>
int main(void)
{
// initialize
std::ofstream fout;
std::ifstream fin;
fout.open("cad.dat");
fout << "C is a Computer Programming Language which is used worldwide, Everyone should learn how to use C" << std::endl;
fin.open("cad.dat");
std::string words;
getline(fin,words);
std::cout << words << std::endl;
while(1)
{
std::string::size_type pos(0);
if (pos = words.find("C") != std::string::npos && words[pos+1] != ' ') //the later one is used to detect the single word "C"
{
words.replace(pos, 1, "C++");
}
else
{
break;
}
}
std::cout << words;
}
You can simplify your program by just using regex as follows:
std::regex f("\\bC\\b");
words = std::regex_replace(words, f, "C++"); // replace "C" with "C++"
Then there is no need for a while loop as shown in the below program:
#include <iostream>
#include <fstream>
#include <regex>
#include <string>
int main(void)
{
// initialize
std::ofstream fout;
std::ifstream fin;
fout.open("cad.dat");
fout << "C is a Computer Programming Language which is used worldwide, Everyone should learn how to use C" << std::endl;
fin.open("cad.dat");
std::string words;
getline(fin,words);
std::cout << words << std::endl;
std::regex f("\\bC\\b");
words = std::regex_replace(words, f, "C++"); // replace "C" with "C++"
std::cout << words;
}
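You need to save pos and use it for the following find operations, but you currently initialize it to 0 on every iteration of the while loop. In addition, because != binds more tightly than =, the condition assigns the result of the comparison (0 or 1) to pos rather than the position that was found.
You could replace the while loop with this, for example: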
for(std::string::size_type pos = 0;
(pos = words.find("C", pos)) != std::string::npos; // start find at pos
pos += 1) // skip last found "C"
{
if(pos + 1 == words.size() || words[pos + 1] == ' ')
words.replace(pos, 1, "C++");
}
Note: This will replace the C in words ending with C too, for example an acronym like C2C will become C2C++. Also, a sentence ending with C. would not be handled. To handle these cases you could add a check for the character before the found C too and add punctuation to the check.
Example:
#include <cctype> // isspace, ispunct
#include <iostream>
#include <string>
int main()
{
std::string words = "C Computer C2C C. I like C, because it's C";
std::cout << words << '\n';
// a lambda to check for space or punctuation characters:
auto checker = [](unsigned char ch) {
return std::isspace(ch) || std::ispunct(ch);
};
for(std::string::size_type pos = 0;
(pos = words.find("C", pos)) != std::string::npos;
pos += 1)
{
if( (pos == 0 || checker(words[pos - 1])) &&
(pos + 1 == words.size() || checker(words[pos + 1]))
) {
words.replace(pos, 1, "C++");
}
}
std::cout << words << '\n';
}
Output:
C Computer C2C C. I like C, because it's C
C++ Computer C2C C++. I like C++, because it's C++

How to check if a string contains punctuation c++

I am attempting to iterate over a string to check for punctuation. I've tried to use ispunct() but am receiving an error that there is no matching function for call to ispunct. Is there a better way to implement this?
for(std::string::iterator it = oneWord.begin(); it != oneWord.end(); it++)
{
if(ispunct(it))
{
}
}
Is there a better way to implement this?
Use std::any_of:
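#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>
// std::ispunct has undefined behaviour for negative char values, so convert first
bool is_punct(char c) { return std::ispunct(static_cast<unsigned char>(c)) != 0; }
int main()
{
std::string s = "Contains punctuation!!";
std::string s2 = "No puncuation";
std::cout << std::any_of(s.begin(), s.end(), is_punct) << '\n';
std::cout << std::any_of(s2.begin(), s2.end(), is_punct) << '\n';
}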
it is an iterator; it points to a character in a string. You have to dereference it to get the thing it points to.
if(ispunct(static_cast<unsigned char>(*it)))
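Put together, the loop from the question with the dereference fix might look like the sketch below. The function name contains_punct is made up for this example.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: true if any character of `word` is punctuation.
bool contains_punct(const std::string& word) {
    for (std::string::const_iterator it = word.begin(); it != word.end(); ++it) {
        // *it yields the char; the cast avoids undefined behaviour for negative values
        if (std::ispunct(static_cast<unsigned char>(*it))) {
            return true;
        }
    }
    return false;
}
```

This stops at the first punctuation character it finds, which is the same behaviour std::any_of gives.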

C++ boost replace_all quote

I have a problem with boost::replace_all. My string looks like:
""Date"":1481200838,""Message"":""
And I would like it to look like:
"Date":1481200838,"Message":"
So I would like to replace "" with a single ":
boost::replace_all(request_json_str, """", """);
But it doesn't work at all. Same with:
boost::replace_all(request_json_str, "\"\"", "\"");
How could I make this to work?
You need to correctly escape the " character in your call to boost::replace_all!
// Example program
#include <iostream>
#include <string>
#include <algorithm>
#include <boost/algorithm/string/replace.hpp>
int main()
{
std::string msg("\"\"Date\"\":1481200838,\"\"Message\"\":\"\"");
boost::replace_all(msg, "\"\"", "\"");
std::cout << msg << std::endl;
}
The boost::replace_all(request_json_str, "\"\"", "\"") already in your question is the correct way to handle this using boost::replace_all: http://coliru.stacked-crooked.com/a/af7cbc753e16cf4f
I wanted to post an additional answer to say that given auto request_json_str = "\"\"Date\"\":1481200838,\"\"Message\"\":\"\""s the repeated quotations could also be removed without Boost (though not quite so eloquently, using unique, distance, and string::resize):
request_json_str.resize(distance(begin(request_json_str), unique(begin(request_json_str), end(request_json_str), [](const auto& a, const auto& b){ return a == '"' && b == '"'; })));
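The same std::unique idea can be written as a small function. This is a sketch only; collapse_quotes is a name invented for this example, and it uses erase instead of resize plus distance.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: collapses each run of consecutive '"' into a single '"'.
std::string collapse_quotes(std::string s) {
    s.erase(std::unique(s.begin(), s.end(),
                        [](char a, char b) { return a == '"' && b == '"'; }),
            s.end());
    return s;
}
```

One difference from boost::replace_all is worth keeping in mind: a run of four quotes collapses to one quote here, whereas replacing "" with " would leave two.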

Weird behavior of std::replace_copy used with std::string

i have this simple code:
#include <iostream>
#include <string>
#include <algorithm>
using namespace std;
int main()
{
string s = "1,0";
string result;
//result.resize(s.length());
replace_copy(s.begin(), s.end(), result.begin(), ',', '.');
cout << '"' << result << '"' << endl;
cout << '"' << result.c_str() << '"' << endl;
cout << result.length() << endl;
return 0;
}
Console output of this program with result.resize line uncommented is:
"1.0"
"1.0"
3
That's OK, but when the line with result.resize is commented out, the output is:
""
"1.0"
0
This can lead to weird errors because result != result.c_str()!
Can this behavior of replace_copy (and possibly of similar templates) be considered an error in the standard library? I cannot find anything relevant to this subject. Thanks.
Compiler: mingw32-g++ 4.7.1
What did you expect?
Without the resize, there is no space in your string for the new characters.
Trying to copy into that space regardless will definitely result in "weird" [read: undefined] behaviour. You're mangling your memory.
replace_copy copies to a target range, which is not the same as inserting new elements into a target container. The range has to already exist…
… unless you use a back_inserter, which functions as a sort of fake range, that actually performs insertion under the hood:
#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
using namespace std;
int main()
{
string s = "1,0";
string result;
//result.resize(s.length()); // Look, ma! No hands!
replace_copy(
s.begin(), s.end(),
std::back_inserter<std::string>(result),
',', '.'
);
cout << '"' << result << '"' << endl;
cout << '"' << result.c_str() << '"' << endl;
cout << result.length() << endl;
}
// "1.0"
// "1.0"
// 3
Live demo
Warning! Getting the right output in that live demo does not prove anything, as undefined behaviour can occasionally "appear to work". However, I have 96k rep and you can trust me. ;)
When you used statement
result.resize(s.length());
you created and initialized (more precisely, assigned) the string with three elements, each with the value '\0'. When this statement was not used, the string had no elements and the behaviour of the program was undefined. In fact, the code with the line uncommented is equivalent to the following:
string s = "1,0";
string result( s.length(), '\0' );
replace_copy(s.begin(), s.end(), result.begin(), ',', '.');
If you want to keep the statement commented out, then you should use the iterator adapter std::back_insert_iterator. For example:
replace_copy(s.begin(), s.end(), std::back_inserter( result ), ',', '.');