I'm trying to read numbers from a string. For example, if I have
string str = "1Hi15This10";
I want to get (1,15,10)
I tried by index but I read 10 as 1 and 0 not 10.
I could not use getline because the string is not separated by anything.
Any ideas?
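Without regex, you can do this (std::isdigit needs <cctype>):
std::string str = "1Hi15This10";
// replace every character that cannot be part of an integer with a space
for (char& c : str)
if (!std::isdigit(static_cast<unsigned char>(c)) && c != '-' && c != '+') c = ' ';
The integers are now separated by spaces, which makes them trivial to parse.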
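One way to parse the space-delimited result is with std::istringstream. The following is only a minimal sketch and not part of the answer above: extract_ints is a hypothetical helper name, and it assumes that signs can be ignored and that every number fits in an int.

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: blank out every non-digit character, then let
// operator>> split the remaining text on whitespace.
// Assumption: signs are ignored and every number fits in an int.
std::vector<int> extract_ints(std::string text) {
    for (char& c : text) {
        if (!std::isdigit(static_cast<unsigned char>(c))) c = ' ';
    }
    std::istringstream in(text);
    std::vector<int> numbers;
    int value;
    while (in >> value) numbers.push_back(value);
    return numbers;
}
// extract_ints("1Hi15This10") yields {1, 15, 10}
```

Taking the string by value keeps the caller's copy untouched, at the cost of one copy.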
IMHO the best way would be to use a regular expression.
#include <iostream>
#include <iterator>
#include <string>
#include <regex>
int main() {
std::string s = "1Hi15This10";
std::regex number("(\\d+)"); // -- match any group of one or more digits
auto begin = std::sregex_iterator(s.begin(), s.end(), number);
// iterate over all valid matches
for (auto i = begin; i != std::sregex_iterator(); ++i) {
std::cout << " " << i->str() << '\n';
// and additional processing, e.g. parse to int using std::stoi() etc.
}
}
Output:
1
15
10
Yes, you could just write your own loop for this, but:
you will probably make some silly mistake somewhere,
a regular expression can be adapted to serve many different search patterns; e.g. what if tomorrow you decide you want to support decimal, negative, or floating-point numbers? These cases and many others are easily supported with a regex, probably not so much with your custom solution :)
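As a rough illustration of that adaptability, here is a sketch that also accepts a leading sign and a fractional part. The pattern and the helper name extract_numbers are assumptions made for this example, not something from the answer above.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect signed integers and simple decimals.
// Assumed pattern: optional sign, one or more digits, optional ".digits".
std::vector<double> extract_numbers(const std::string& text) {
    static const std::regex number(R"([-+]?\d+(?:\.\d+)?)");
    std::vector<double> values;
    for (auto it = std::sregex_iterator(text.begin(), text.end(), number);
         it != std::sregex_iterator(); ++it) {
        values.push_back(std::stod(it->str()));
    }
    return values;
}
// extract_numbers("x-1.5y20z+3") yields {-1.5, 20, 3}
```

Note that with this pattern a hyphen directly in front of a digit is always read as a minus sign, so "3-4" produces 3 and -4.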
First of all, I am nothing but new to both programming and Stack Overflow.
I am self-studying with Schaum's outline for Programming with C++ and I have some issues with problem 8.24 (solutions are given to almost every problem in the book, but I want to know why my code in particular isn't working as expected).
You are supposed to be given a c-string and return the given string, but with all its tokens in reverse order (but keeping the natural order of the token itself).
That is, given "Enter a sentence" it would show on screen "sentence a Enter".
My code is the following:
#include <iostream>
#include <cstring>
using namespace std;
int main()
{
char line1[100];
cout << "Enter a sentence (enter \".\" to terminate input):\n";
cin.getline(line1,100,'.');
char line2[strlen(line1) + 1]; // we add 1 for the empty char that ends every c string
int char_count = strlen(line1); //strlen() does not include the empty char
char* p = strtok(line1," ");
while (p)
{
char_count -= strlen(p); // we subtract p's len to start adding its chars
for (int i = 0; i <= strlen(p); i++)
line2[char_count + i] = p[i]; // we then add the chars themselves
if ((char_count - 1) > 0)
line2[--char_count] = ' '; // a blank space is needed between the different tokens
p = strtok(NULL, " ");
}
cout << "\n" << line2 << "\n";
}
Unfortunately, the code is wrong in many ways. The most obvious thing is the obscurity of the word reversal process (and the fact it is mixed with word iteration).
According to the commenters, you are not using C++. In C++ it would be rather straightforward:
#include <algorithm>
#include <iostream>
#include <string>
void reverse_words(std::string& s) {
/* starting position of the word */
size_t last_pos = 0;
do {
/* find the end of current word */
size_t end_pos = std::min( s.find(' ', last_pos + 1), s.size() );
/* reverse one word inplace */
std::reverse(s.begin() + last_pos, s.begin() + end_pos);
/* advance to the beginning of the next word */
last_pos = end_pos + 1;
} while (last_pos < s.size());
std::reverse(s.begin(), s.end());
}
int main()
{
std::string s = "This is a sentence";
reverse_words(s);
std::cout << s << std::endl;
}
Hopefully, you can see the essence of the method: sequentially find start and finish of each word, reverse letter order in this word and then finally reverse the entire string.
Now, getting back to the C-string question. You can replace std::string::find call with strtok and write your version of std::reverse specialized for C strings (the reversal of the entire string or its part is simpler than reversing the word order and this is also the recommended exercise).
Start from a simpler program which prints out pairs of integers (start_pos and end_pos for each word) using strtok. Then write a reverse procedure and test it also. Finally, combine this word iteration with reverse. I personally think this is the only way to be sure your implementation is correct - being sure in each of its parts and being able to test each part individually.
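For reference, here is one possible shape of the C-string version. It is a sketch under stated assumptions: it scans for spaces by hand instead of calling strtok (strtok overwrites the delimiters with '\0', which gets in the way of reversing in place), and reverse_range and reverse_words_in_place are hypothetical names.

```cpp
#include <cstring>
#include <utility>

// Reverse the characters in the half-open range [first, last).
void reverse_range(char* first, char* last) {
    while (first < last) {
        --last;                    // last now points at a real character
        std::swap(*first, *last);
        ++first;
    }
}

// Reverse the order of space-separated words in a C string, in place.
// Assumption: words are separated by single spaces.
void reverse_words_in_place(char* s) {
    char* word = s;                // start of the current word
    for (char* p = s; ; ++p) {
        if (*p == ' ' || *p == '\0') {
            reverse_range(word, p);    // reverse the letters of one word
            if (*p == '\0') break;
            word = p + 1;
        }
    }
    reverse_range(s, s + std::strlen(s));  // then reverse the whole string
}
// With char buf[] = "Enter a sentence"; the call leaves "sentence a Enter" in buf.
```

Each helper can be tested on its own, which follows the advice above.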
A lot of improvements have been added to C++ since that book was originally written, and we can do it in a lot cleaner and safer way now. We'll break the problem into two parts:
A function to convert a string into a list of tokens
The main function, which reads the string; reverses it; and prints it.
These functions will be tokenize, which returns a vector of string_view, and main. A string_view is just a class that stores a pointer and a size to some other string. It's efficient because it won't make a copy of the string or allocate any memory. In this case, it's the right tool for the job because we're going to be breaking up an existing string.
#include <string_view>
#include <string>
#include <iostream>
#include <vector>
#include <algorithm>
auto tokenize(std::string_view line) {
std::vector<std::string_view> tokens;
for (size_t token_size = line.find(' ');
token_size != line.npos;
token_size = line.find(' '))
{
tokens.push_back(line.substr(0, token_size));
line.remove_prefix(token_size + 1);
}
tokens.push_back(line);
return tokens;
}
int main() {
std::string line;
std::getline(std::cin, line);
auto tokens = tokenize(line);
std::reverse(tokens.begin(), tokens.end());
for(auto token : tokens) {
std::cout << token << ' ';
}
std::cout << std::endl;
}
Explaining tokenize
Tokenize takes a string_view as input, and returns a list of the tokens. line.find(' ') will look for a space. If it finds one, it'll return the position of the space; otherwise, it'll return line.npos (which is basically the biggest possible size).
For every token we find, we
Get the token via line.substr(0, token_size)
Add the token to the vector via tokens.push_back
Then, we update the line by removing the first token and the corresponding space. This is line.remove_prefix(token_size + 1);
Once there are no more spaces, we'll add the remainder of the line to the vector using tokens.push_back(line);, and then we'll return the vector of tokens.
Explaining main
We can get the line via std::getline(std::cin, line);, which will read a line from cin and put it in the variable we give it (line). After that, we can read all the tokens in the line using the tokenize function we wrote. We'll reverse the vector of tokens via std::reverse, and then we'll print out all the tokens.
Thanks to each of you.
Seeing your answers I have learnt quite a lot about good programming (both regarding syntax and original ways to solve the problem itself, as Viktor's).
I apologise if I have not given the proper feedback, but again I am (still) unfamiliar with Stack's customs and ''policies''.
Update: Kindly read my comment on jignatius's answer
I wrote the following code to find specific matches in a string using regex and to delete them and replace with another value, but it doesn't work as expected.
For example given the following input:
f={a,b}+{c,d}
I would expect it to delete both {a,b} and {c,d}, but it only works on the first one. What is wrong with my code?
After some checking I can see that the first loop is entered only once, but why?
There is a standard library function, std::regex_replace, in the header <regex> that does what you want to do: text replacement based on a regex. That will simplify things quite a lot for you compared with a hand-crafted loop.
You just need to supply the input string, the regex to match against, and the replacement string:
#include <iostream>
#include <regex>
#include <string>
int main()
{
std::regex reg(R"(\{[^}]*\})");
std::string mystring = "f={a,b}+{c,d}";
auto newstring = std::regex_replace(mystring, reg, "title");
std::cout << newstring; //f=title+title
}
Note: it's also easier to use a raw string literal with the format R"(literal)" to avoid using double backslashes to escape special characters in the regex.
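To make the raw-literal point concrete, the sketch below writes the same pattern both ways. The names escaped_pattern, raw_pattern and replace_braced are hypothetical and only serve this example.

```cpp
#include <regex>
#include <string>

// The same pattern written twice: once with doubled backslashes and
// once as a raw string literal. Both hold the characters \{[^}]*\}
const std::string escaped_pattern = "\\{[^}]*\\}";
const std::string raw_pattern = R"(\{[^}]*\})";

// Hypothetical helper: a thin wrapper around std::regex_replace.
std::string replace_braced(const std::string& text, const std::string& replacement) {
    static const std::regex braces(raw_pattern);
    return std::regex_replace(text, braces, replacement);
}
// replace_braced("f={a,b}+{c,d}", "title") yields "f=title+title"
```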
In your comment you say that the replacement text can change. In that case, you will have to write a loop rather than use a straightforward regex replace.
You can use std::regex_iterator, a read-only forward iterator that will call std::regex_search() for you. You can use a string stream to build the new string:
#include <iostream>
#include <iterator>
#include <regex>
#include <sstream>
#include <string>
#include <vector>
int main()
{
std::regex reg(R"(\{[^}]*\})");
std::string mystring = "f={a,b}+{c,d} + c";
std::vector<std::string> replacements = { "rep1", "rep2", "rep3" };
int i = 0;
auto start = std::sregex_iterator(mystring.begin(), mystring.end(), reg);
auto end = std::sregex_iterator{};
std::ostringstream ss;
for (std::sregex_iterator it = start; it != end; ++it)
{
std::smatch mat = *it;
ss << mat.prefix() << replacements[i++];
//If last match, stream suffix
if (std::next(it) == end)
{
ss << mat.suffix();
}
}
std::cout << ss.str(); //f=rep1+rep2 + c
}
Note that the prefix() method of the std::smatch object will give you the substring from the target string to the beginning of the match. Then you place your replacement text into the stream. Finally, you should use the suffix() method of the std::smatch object to stream any trailing text between the last match and the end of your target string.
I'm trying to get a sentence delimited by certain characters (either a space, comma, or a dot) to check if it's a palindrome. If the input is "hello,potato.", I'll check this symmetry on "hello" alone and then potato alone.
The problem is, while I'm doing the first iteration of the loop that searches for the delimiter, the word "hello" is stored in the sub-sentence, but on the second iteration the word that should be stored as "potato" will be "potato.". And I am unable to remove the "." delimiter from the end of the input string.
for(int i=0;i<sentence.length();i++)
{
if(sentence[i]==' '||sentence[i]=='.'||sentence[i]==',')
{ //the couts is just to help me debug/trace
cout<<"i is now : "<<i<<endl;
if(i==delindex && i==sentence.length()-1)
{
subsentence=sentence.substr(temp+1,subsentence.length()-1);
}
else
{
subsentence=sentence.substr(delindex,i);
cout<<subsentence<<endl;
temp=delindex-1;
delindex=i+1;
}
}
}
What would be the best way to go about this?
god bless you man that strtok is what i have been looking for
Actually, you don't need strtok (and should probably avoid it for various safety reasons), as std::string has a wonderful method called find_first_of which acts pretty much like strtok: it accepts a set of characters and returns the index of the first one it finds. However, to make a robust tokenizer, a combination of find_first_of and find_first_not_of is more suitable in this case.
Therefore you could simplify your token searching to:
#include <iostream>
#include <string>
int main()
{
std::string sentence = "hello,potato tomato.";
std::string delims = " .,";
size_t beg, pos = 0;
while ((beg = sentence.find_first_not_of(delims, pos)) != std::string::npos)
{
pos = sentence.find_first_of(delims, beg + 1);
std::cout << sentence.substr(beg, pos - beg) << std::endl;
}
}
https://ideone.com/rhMyvG
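To connect this back to the palindrome check in the question, the same loop can collect the tokens into a vector and test each one. This is a sketch only: split_tokens and is_palindrome are hypothetical helper names, and the comparison is case-sensitive by assumption.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: the tokenizer loop from above, collecting tokens
// instead of printing them.
std::vector<std::string> split_tokens(const std::string& sentence,
                                      const std::string& delims = " .,") {
    std::vector<std::string> tokens;
    std::size_t beg = 0, pos = 0;
    while ((beg = sentence.find_first_not_of(delims, pos)) != std::string::npos) {
        pos = sentence.find_first_of(delims, beg + 1);
        tokens.push_back(sentence.substr(beg, pos - beg));
    }
    return tokens;
}

// Hypothetical helper: compare the first half with the reversed second half.
// Assumption: the comparison is case-sensitive.
bool is_palindrome(const std::string& word) {
    return std::equal(word.begin(), word.begin() + word.size() / 2, word.rbegin());
}
// split_tokens("hello,potato.") yields {"hello", "potato"}
```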
I have a string and I want to remove all the punctuation from it. How do I do that? I did some research and found that people use the ispunct() function (I tried that), but I can't seem to get it to work in my code. Anyone got any ideas?
#include <string>
int main() {
string text = "this. is my string. it's here."
if (ispunct(text))
text.erase();
return 0;
}
Using the algorithm std::remove_copy_if with a lambda (std::ptr_fun was removed in C++17):
std::string text, result;
std::remove_copy_if(text.begin(), text.end(),
std::back_inserter(result), // store the output
[](unsigned char c) { return std::ispunct(c) != 0; }
);
POW already has a good answer if you need the result as a new string. This answer is how to handle it if you want an in-place update.
The first part of the recipe is std::remove_if, which can remove the punctuation efficiently, packing all the non-punctuation as it goes.
std::remove_if (text.begin (), text.end (), ispunct)
Unfortunately, std::remove_if doesn't shrink the string to the new size. It can't, because it has no access to the container itself. Therefore, there are junk characters left in the string after the packed result.
To handle this, std::remove_if returns an iterator that marks the end of the part of the string that's still needed. This can be used with the string's erase method, leading to the following idiom...
text.erase (std::remove_if (text.begin (), text.end (), ispunct), text.end ());
I call this an idiom because it's a common technique that works in many situations. Other types than string provide suitable erase methods, and std::remove (and probably some other algorithm library functions I've forgotten for the moment) take this approach of closing the gaps for items they remove, but leaving the container-resizing to the caller.
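As a small illustration of the same idiom on another container, here is a sketch using std::vector and std::remove. The helper name erase_value is hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: drop every occurrence of `value` from `items`.
// std::remove packs the kept elements at the front and returns the new
// logical end; erase then trims the leftover tail.
void erase_value(std::vector<int>& items, int value) {
    items.erase(std::remove(items.begin(), items.end(), value), items.end());
}
// Starting from {1, 0, 2, 0, 3}, erase_value(items, 0) leaves {1, 2, 3}.
```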
#include <string>
#include <iostream>
#include <cctype>
int main() {
std::string text = "this. is my string. it's here.";
for (int i = 0, len = text.size(); i < len; i++)
{
if (ispunct(text[i]))
{
text.erase(i--, 1);
len = text.size();
}
}
std::cout << text;
return 0;
}
Output
this is my string its here
When you delete a character, the size of the string changes. It has to be updated whenever deletion occurs. And, you deleted the current character, so the next character becomes the current character. If you don't decrement the loop counter, the character next to the punctuation character will not be checked.
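ispunct takes a char value, not a string.
You can do something like:
for (std::size_t i = 0; i < text.size(); )
if (ispunct(static_cast<unsigned char>(text[i]))) text.erase(i, 1); // erase one character; do not advance
else ++i;
This will work, but it is a slow algorithm.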
Pretty good answer by Steve314.
I would like to add a small change :
text.erase (std::remove_if (text.begin (), text.end (), ::ispunct), text.end ());
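Adding :: before ispunct selects the C library function in the global namespace, which avoids the ambiguity with the overloaded std::ispunct template declared in <locale>.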
The problem here is that ispunct() takes one argument being a character, while you are trying to send a string. You should loop over the elements of the string and erase each character if it is a punctuation like here:
for(size_t i = 0; i<text.length(); ++i)
if(ispunct(text[i]))
text.erase(i--, 1);
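#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>
using namespace std;
int main() {
string str = "this. is my string. it's here.";
// Note: this overwrites each punctuation character with '\0';
// the length of the string does not change.
transform(str.begin(), str.end(), str.begin(), [](char ch)
{
if( ispunct(static_cast<unsigned char>(ch)) )
return '\0';
return ch;
});
}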
#include <iostream>
#include <string>
using namespace std;
int main()
{
string s;//string is defined here.
cout << "Please enter a string with punctuation: " << endl;//Asking for users input
getline(cin, s);//reads in a single string one line at a time
/* ERROR Check: The loop didn't run at first because a semi-colon was placed at the end
of the statement. Remember not to add it for loops. */
for(auto &c : s) //loop checks every character
{
if (ispunct(c)) //to see if its a punctuation
{
c=' '; //if so it replaces it with a blank space.(delete)
}
}
cout << s << endl;
system("pause");
return 0;
}
Another way you could do this would be as follows:
#include <cctype> // needed for ispunct()
#include <string>
using namespace std;
string onlyLetters(string str){
string retStr = "";
for(size_t i = 0; i < str.length(); i++){
if(!ispunct(static_cast<unsigned char>(str[i]))){
retStr += str[i];
}
}
return retStr;
}
This ends up creating a new string instead of actually erasing the characters from the old string, but it is a little easier to wrap your head around than using some of the more complex built in functions.
I tried to apply @Steve314's answer but couldn't get it to work until I came across this note here on cppreference.com:
Notes
Like all other functions from <cctype>, the behavior of std::ispunct
is undefined if the argument's value is neither representable as
unsigned char nor equal to EOF. To use these functions safely with
plain chars (or signed chars), the argument should first be converted
to unsigned char.
By studying the example it provides, I am able to make it work like this:
#include <string>
#include <iostream>
#include <cctype>
#include <algorithm>
int main()
{
std::string text = "this. is my string. it's here.";
std::string result;
text.erase(std::remove_if(text.begin(),
text.end(),
[](unsigned char c) { return std::ispunct(c); }),
text.end());
std::cout << text << std::endl;
}
Try this one; it will remove all the punctuation from the string read from the text file.
str.erase(remove_if(str.begin(), str.end(), ::ispunct), str.end());
please reply if helpful
i got it.
size_t found = text.find('.');
text.erase(found, 1);
I have a std::string and wish for the first letter to be capitalized and the rest lower case.
One way I could do this is:
const std::string example("eXamPLe");
std::string capitalized = boost::to_lower_copy(example);
capitalized[0] = toupper(capitalized[0]);
Which would yield capitalized as:
"Example"
But perhaps there is a more straightforward way to do this?
If the string is indeed just a single word, std::string capitalized = boost::locale::to_title (example) should do it. Otherwise, what you've got is pretty compact.
Edit: just noticed that the boost::python namespace has a str class with a capitalize() method which sounds like it would work for multi word strings (assuming you want what you described and not title case). Using a python string just to gain that functionality is probably a bad idea, however.
A boost-less solution is:
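#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>
int main()
{
const std::string example("eXamPLe");
std::string s = example;
if (!s.empty()) {
// upper-case the first character, lower-case the rest
s[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[0])));
std::transform(s.begin()+1, s.end(), s.begin()+1,
[](unsigned char c) { return static_cast<char>(std::tolower(c)); });
}
std::cout << s << "\n";
}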
I think the string variable name is example and the string stored in it is "example".
So try this:
example[0] = toupper(example[0]);
for(int i=1 ; example[i] != '\0' ; ++i){
example[i] = tolower(example[i]);
}
cout << example << endl;
This should give you the first character capitalized and the rest of the string in lowercase.
It's not very different from the original solution, just a different approach.
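If this comes up in more than one place, the logic can be wrapped in a small function. The following is a sketch with a hypothetical name, capitalize, and it assumes single-byte characters, since std::toupper and std::tolower work on one char at a time.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: lower-case the whole string, then upper-case the
// first character. Assumption: single-byte characters only.
std::string capitalize(std::string text) {
    if (text.empty()) return text;  // nothing to capitalize
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    text[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(text[0])));
    return text;
}
// capitalize("eXamPLe") yields "Example"
```

The conversion to unsigned char before calling the <cctype> functions avoids undefined behavior for negative char values.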