Extract only and all digits from a string [duplicate] - regex

I have a long string. What is the regular expression to split the numbers into the array?

Are you removing or splitting? This will remove all the non-numeric characters.
myStr = myStr.replaceAll("[^\\d]", "");

One more approach for removing all non-numeric characters from a string:
String newString = oldString.replaceAll("[^0-9]", "");

String str= "somestring";
String[] values = str.split("\\D+");

Another regex solution:
string.replace(/\D/g,''); //remove the non-Numeric
Similarly, you can
string.replace(/\W/g,''); //remove the non-word characters (keeps letters, digits, and _)
In a regex, a backslash followed by certain letters forms a shorthand character class: \w matches a word character (a letter, a digit, or the underscore), and \W matches anything else. Capitalizing the letter negates the class.

You will want to use the String class's split() method and pass in the regular expression "\\D+", which matches one or more non-digit characters.
myString.split("\\D+");

Java 8 collection streams :
StringBuilder sb = new StringBuilder();
test.chars().mapToObj(i -> (char) i).filter(Character::isDigit).forEach(sb::append);
System.out.println(sb.toString());

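In Flex (the original answer was tested on SDK 4.14.0), this keeps the digits and the decimal point. ActionScript regular expressions have no && class intersection, so the characters to keep are listed directly in the negated class:
myString.replace(/[^0-9.]/g, "");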

you could use a recursive method like below:
public static String getAllNumbersFromString(String input) {
if (input == null || input.length() == 0) {
return "";
}
char c = input.charAt(input.length() - 1);
String newinput = input.substring(0, input.length() - 1);
if (c >= '0' && c<= '9') {
return getAllNumbersFromString(newinput) + c;
} else {
return getAllNumbersFromString(newinput);
}
}

Previous answers will strip your decimal point. If you want to keep it, you might want to split like this:
String str = "My values are : 900.00, 700.00, 650.50";
String[] values = str.split("[^\\d.]+");
// split on runs of characters that are neither digits nor the '.' decimal point
// values: { "", "900.00", "700.00", "650.50" } (the leading "" appears because the string starts with a delimiter)

Simple way without using regex:
public static String getOnlyNumerics(String str) {
    if (str == null) {
        return null;
    }
    // StringBuilder is enough here; the synchronization of StringBuffer is not needed
    StringBuilder digits = new StringBuilder();
    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);
        if (Character.isDigit(c)) {
            digits.append(c);
        }
    }
    return digits.toString();
}

Related

How to match a string to regex based on a certain index? [closed]

I am working on a small project of mine, and I need to match a string to a regex value, and the first character of the match needs to start exactly at a certain index.
I need to do this in C++, but can't find a method or function that seems like it would work on cplusplus.com.
Eg. if I put in the string Stack overflow into the pattern f.ow
and an index of 3, it should return false. But if I set the index to 10, it should return true and allow me to find out what actually matched (flow). And if I put in an index of 11, it should also return false.
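Try this function; it should work for you:
#include <iostream>
#include <regex>
#include <string>
using namespace std;
// True only if the pattern matches starting exactly at `index`.
bool exactRegexMatch(const string& str, size_t index) {
    if (index > str.size()) return false; // substr would throw out_of_range
    static const regex reg("^f(.)ow");
    // regex_search plus ^ anchors the match at the start of the substring
    // while still allowing text after the match (regex_match would not).
    return regex_search(str.substr(index), reg);
}
int main() {
    if (exactRegexMatch("Stack Overflow", 10)) {
        cout << "True" << endl;
    }
    else {
        cout << "False" << endl;
    }
}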
How to match a string to regex based on a certain index?
Pass the substring starting from the index as the string argument, and add ^ to the beginning of the regex so that it only matches when the pattern starts from the beginning of the substring.
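Alternatively, match against the whole string by building the offset into the regex (strOffset is the offset converted to a string):
strRx = "^[\\S\\s]{" + strOffset + "}(f.ow)";
// make a regex out of strRx; capture group 1 then holds the match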
or, use iterators to jump to the location to start matching from
bool FindInString( std::string& sOut, const std::string& sInSrc, std::size_t offset )
{
    static const std::regex Rx("(f.ow)");
    sOut = "";
    bool bRet = false;
    if ( offset > sInSrc.size() )
        return bRet; // offset past the end: nothing to match
    // Names beginning with an underscore and a capital letter are reserved, so plain names are used here
    std::smatch match;
    std::string::const_iterator itStart = sInSrc.begin() + offset;
    std::string::const_iterator itEnd = sInSrc.end();
    if ( std::regex_search( itStart, itEnd, match, Rx ) )
    {
        // Check that it matched at exactly the itStart position
        if ( match[1].first == itStart )
        {
            sOut = std::string(match[1].first, match[1].second);
            bRet = true;
        }
    }
    return bRet;
}
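The position check above can also be delegated to the library. The following is a minimal sketch, assuming the standard std::regex_constants::match_continuous flag, which only accepts a match that begins at the first iterator. The helper name match_at and its parameters are hypothetical and not part of the original answers.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true only if `re` matches starting exactly
// at `index`. On success, `matched` receives the matched text.
bool match_at(const std::string& text, std::size_t index,
              const std::regex& re, std::string& matched) {
    if (index > text.size()) {
        return false;  // offset past the end: nothing can match
    }
    const auto first =
        text.cbegin() + static_cast<std::string::difference_type>(index);
    std::smatch m;
    // match_continuous: the match must begin at `first`, so no search
    // further along the string is attempted.
    if (!std::regex_search(first, text.cend(), m, re,
                           std::regex_constants::match_continuous)) {
        return false;
    }
    matched = m.str();
    return true;
}
```

With the question's example, the pattern f.ow against "Stack overflow" should succeed only at index 10 and leave "flow" in matched.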

How to update regex to allow empty value or alphanumeric only

I'm trying to modify a regex so it will allow an empty value or alphanumeric only.
I currently have this, but it only validates the alphanumeric
if (ruletype eq "alphanumeric") {
bMatch = true;
variables.fieldName = listGetAt(arguments.rules[nRow],2,",");
if (structKeyExists(arguments.form, "#variables.fieldName#")){
if (NOT RefindNoCase("[[:alnum:]]",arguments.form[variables.fieldName])) {
lstError = listAppend(lstError,nRow,",");
}
} else {
lstError = listAppend(lstError,nRow,",");
}
}
I tried converting to rematch to find the empty value, but that also accepts the value 1234^%^&& which contains special characters. I'm not sure how to fix that.
Do I understand correctly, from the value you mention, that arguments.form[variables.fieldName] is a comma-delimited list? If so, then what is to be matched is each list item. (Incidentally, a literal # in a value such as sdkfk364563!##$% has to be escaped by doubling it.)
A possible answer is then:
if (structKeyExists(arguments.form, variables.fieldName)){
// Assuming arguments.form[variables.fieldName] is a comma-delimited list
fieldNameArray=listToArray(arguments.form[variables.fieldName], ',', true);
for (fieldValue in fieldNameArray) {
fieldValue=trim(fieldValue);
if (REfindNoCase("^[a-zA-Z0-9]*$",fieldValue) eq 0) { // * lets an empty value pass; anything non-alphanumeric is an error
lstError = listAppend(lstError,nRow);
}
}
}
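[[:alnum:]] is POSIX syntax, which may not be supported everywhere; [a-zA-Z0-9] is the portable ASCII equivalent. The larger problem is that the original pattern is unanchored, so it succeeds as soon as any single alphanumeric character appears, which is why 1234^%^&& passes. Anchor the pattern with ^ and $, and use * so that an empty value also matches. REFindNoCase returns a position (0 when nothing matches), so compare the result with 0, and trim the value first to discard surrounding spaces.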
if (structKeyExists(arguments.form, variables.fieldName)){
if (REfindNoCase("^[a-zA-Z0-9]*$",trim(arguments.form[variables.fieldName])) eq 0) {
lstError = listAppend(lstError,nRow);
}
}

How to use regex in C++? [closed]

I have a string like below:
std::string myString = "This is string\r\nIKO\r\n. I don't exp\r\nO091\r\nect some characters.";
Now I want to get rid of the characters between \r\n including \r\n.
So the string must look like below:
std::string myString = "This is string. I don't expect some characters.";
I am not sure, how many \r\n's going to appear.
And I have no idea what characters are coming between \r\n.
How could I use regex in this string?
Personally, I'd do a simple loop with find. I don't see how using regular expressions helps much with this task. Something along these lines:
string final;
size_t cur = 0;
for (;;) {
size_t pos = myString.find("\r\n", cur);
final.append(myString, cur, pos - cur);
if (pos == string::npos) {
break;
}
pos = myString.find("\r\n", pos + 2);
if (pos == string::npos) {
// Odd number of delimiters; handle as needed.
break;
}
cur = pos + 2;
}
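Quantifiers are "greedy" by default in most regex libraries, so use the lazy form to stop at the nearest closing delimiter.
Just make your regex say "\r\n.*?\r\n" and you should be fine.
EDIT
Then either replace every match with an empty string, or split your input on the regex and join the pieces. For the example above, which contains two delimited spans, splitting yields three strings that you can combine into the desired result.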
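A minimal sketch of the regex route, assuming C++11 std::regex with its default ECMAScript grammar. The helper name strip_crlf_spans is hypothetical. It uses a lazy quantifier so each match ends at the nearest closing \r\n, and [\s\S] instead of . because . does not match line terminators in that grammar.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: removes every "\r\n ... \r\n" span, delimiters included.
std::string strip_crlf_spans(const std::string& input) {
    // [\s\S]*? matches any characters, as few as possible, so each match
    // stops at the first \r\n that closes the span.
    static const std::regex span(R"(\r\n[\s\S]*?\r\n)");
    return std::regex_replace(input, span, "");
}
```

An unpaired trailing \r\n is left in place, since the pattern needs both delimiters to match.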

C++ How to get string/char in between 2 words

I have a word like this:
AD#Andorra
I have a few questions:
How do I check that
AD?Andorra exists?
Here ? is a wildcard; it could be a comma, a hash, a dollar sign, or some other character.
Then, after confirming that AD?Andorra exists, how do I get the value of ?
Thanks,
Chen
The problem can be solved generally with a regular expression match. However, for the specific problem you presented, this would work:
std::string input = getinput();
char at2 = '\0';
if (input.size() > 2) { // guard: indexing a shorter string would be out of bounds
    at2 = input[2];
    input[2] = '#';
}
if (input == "AD#Andorra") {
    // match, and char of interest is in at2;
} else {
    // doesn't match
}
If the ? is supposed to represent a string also, then you can do something like this:
bool find_inbetween (std::string input,
std::string &output,
const std::string front = "AD",
const std::string back = "Andorra") {
if ((input.size() < front.size() + back.size())
|| (input.compare(0, front.size(), front) != 0)
|| (input.compare(input.size()-back.size(), back.size(), back) != 0)) {
return false;
}
output = input.substr(front.size(), input.size()-front.size()-back.size());
return true;
}
If you are on C++11 or use Boost (which I strongly recommend!), use regular expressions. Once you gain some understanding of them, text processing becomes much easier!
#include <regex> // or #include <boost/regex.hpp>
//! \return A separating character or 0, if str does not match the pattern
char getSeparator(const char* str)
{
using namespace std; // change to "boost" if not on C++11
static const regex re("^AD(.)Andorra$");
cmatch match;
if (regex_match(str, match, re))
{
return *(match[1].first);
}
return 0;
}
Assuming your character is always the third one (index 2), use the string function substr:
your_string.substr(2, 1)
If you are using C++11, I recommend using regex instead of searching the string directly.

C++ split string with space and punctuation chars

I wanna split an string using C++ which contains spaces and punctuations.
e.g. str = "This is a dog; A very good one."
I wanna get "This" "is" "a" "dog" "A" "very" "good" "one" 1 by 1.
It's quite easy with only one delimiter using getline but I don't know all the delimiters. It can be any punctuation chars.
Note: I don't wanna use Boost!
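Use std::find_if() with a lambda to find the first delimiter.
// unsigned char parameter: passing a negative char to isspace/ispunct is undefined behavior
auto it = std::find_if(str.begin(), str.end(), [] (unsigned char element) -> bool {
    return std::isspace(element) || std::ispunct(element);});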
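The single find_if call only locates the first delimiter. A sketch of how it might be extended into a full splitter follows; the function name split_words is hypothetical, and the sketch treats every whitespace or punctuation character as a delimiter.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: splits on any whitespace or punctuation character.
std::vector<std::string> split_words(const std::string& str) {
    // Taking unsigned char avoids undefined behavior in <cctype> functions
    // for characters with negative values.
    auto is_delim = [](unsigned char c) {
        return std::isspace(c) != 0 || std::ispunct(c) != 0;
    };
    std::vector<std::string> words;
    auto it = str.begin();
    while (it != str.end()) {
        // Skip delimiters, then take everything up to the next delimiter.
        auto first = std::find_if_not(it, str.end(), is_delim);
        auto last = std::find_if(first, str.end(), is_delim);
        if (first != last) {
            words.emplace_back(first, last);
        }
        it = last;
    }
    return words;
}
```

Because apostrophes count as punctuation, a word such as "don't" would be split into "don" and "t"; adjust is_delim if that matters.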
So, starting at the first position, you find the first valid token. You can use
index = str.find_first_not_of (yourDelimiters);
Then you have to find the first delimiter after this, so you can do
delimIndex = str.substr (index).find_first_of (yourDelimiters);
your first word will then be
// since delimIndex will essentially be the length of the word
word = str.substr (index, delimIndex);
Then you truncate your string and repeat. You have to, of course, handle all of the cases where find_first_not_of and find_first_of return npos, which means that character was/was not found, but I think that's enough to get started.
Btw, I'm not claiming that this is the best method, but it works...
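The steps above can be sketched as a loop. This is one possible arrangement rather than the answerer's exact code: the function name split_on is hypothetical, and it passes a start position to find_first_of instead of truncating the string, which also takes care of the npos cases.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits `str` on any character found in `delimiters`.
std::vector<std::string> split_on(const std::string& str,
                                  const std::string& delimiters) {
    std::vector<std::string> words;
    std::size_t start = str.find_first_not_of(delimiters);
    while (start != std::string::npos) {
        std::size_t end = str.find_first_of(delimiters, start);
        // If end is npos, the length is clamped by substr, so the rest of
        // the string becomes the last word.
        words.push_back(str.substr(start, end - start));
        // Searching from npos finds nothing and returns npos, ending the loop.
        start = str.find_first_not_of(delimiters, end);
    }
    return words;
}
```

Calling split_on(str, " ;.") on the question's sentence should yield the eight words; the delimiter list has to be spelled out by the caller.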
vmpstr's solution works, but could be a bit tedious.
Some months ago, I wrote a C library that does what you want.
http://wiki.gosub100.com/doku.php?id=librerias:c:cadenas
Documentation has been written in Spanish (sorry).
It doesn't need external dependencies. Try with splitWithChar() function.
Example of use:
#include <stdio.h>
#include "string_functions.h"
int main(void){
char yourString[]= "This is a dog; A very good one.";
char* elementsArray[8];
int nElements;
int i;
/*------------------------------------------------------------*/
printf("Character split test:\n");
printf("Base String: %s\n",yourString);
nElements = splitWithChar(yourString, ' ', elementsArray);
printf("Found %d elements.\n", nElements);
for (i=0;i<nElements;i++){
printf ("Element %d: %s\n", i, elementsArray[i]);
}
return 0;
}
The original string "yourString" is modified by splitWithChar(), so be careful.
Good luck :)
C++, unlike Java, doesn't provide a built-in way to split a string on a delimiter. You can use the Boost library for this, but if you want to avoid it, a hand-written loop suffices.
vector<string> split(string s) {
vector<string> words;
string word = "";
for(char x: s) {
if(x == ' ' or x == ',' or x == '?' or x == ';' or x == '!'
or x == '.') {
if(word.length() > 0) {
words.push_back(word);
word = "";
}
}
else
word.push_back(x);
}
if(word.length() > 0) {
words.push_back(word);
}
return words;